NEURAL NETWORKS IN SPONTANEOUS
SPEECH ASSESSMENT OF
DYSPHASIC PATIENTS
Sameer Singh
School of Computing
University of Plymouth
Plymouth PL4 8AA
email: s1singh@plym.ac.uk
Abstract
Neural networks can be used successfully to classify dysphasic subjects from their conversational speech using a set of linguistic measures. I shall illustrate the approach with particular reference to its application in classifying agrammatic patients. Linguistic measures can be applied to the transcribed texts of conversational speech of both normal and agrammatic subjects, and they quantify the availability of linguistic features that depend on word-frequency. The paper presents the results of a cross-validation study in which neural networks take linguistic measurements as inputs for classifying moderate Broca’s aphasics and normal subjects, and compares them against the results obtained by applying linear discriminant analysis to the same data.
Keywords: Agrammatism, Linguistic analysis, Data transcription, Cross-validation, Artificial neural networks, Linear discriminant analysis
______________________________
S. Singh. Neural Networks in Spontaneous Speech Assessment of Dysphasic Patients, Proc. 2nd International Conference on Neural Networks and Expert Systems in Medicine and Healthcare, Plymouth, UK, pp. 57-64 (26-28 August, 1996).
1. Assessment of Spontaneous Speech
It has been widely recognized in the last few years that quantifying the conversational speech of dysphasic adults is important for their proper assessment and treatment. Consequently, such analyses have recently become an important topic of research at both the grammatical and pragmatic levels. Linguistic analysis of agrammatic patients allows their scores to be calculated reliably and objectively, and these scores can be used to calculate a final index of performance on a uni-dimensional scale of linguistic ability. Quantifying the conversational skills of agrammatic stroke patients over a period of time is important for estimating their rate of recovery. This recovery can be successfully predicted using neural networks [1]. Code, Rowley and Kertesz (1994) have shown that test scores obtained through formal assessment of dysphasic patients on the Western Aphasia Battery at 3 months post-onset can be used to reliably predict scores at 12 months post-onset using neural networks. Apart from predicting recovery, neural networks can also be used for separating normal from disordered performances and for distinguishing different types of dysphasia using linguistic inputs.
Agrammatism, a type of language disorder, results from lesions in the left hemisphere of the brain, which often damage Broca’s area [2]. Agrammatic patients are mostly impaired in their syntactical and lexical abilities, as explained by Berndt and Caramazza [3]: “...Such speech is composed chiefly of nouns, adjectives, and main verbs with relatively few pronouns, articles, prepositions, auxiliary verbs and conjunctions - the so called function words”. Hence most dysphasic patients suffer from problems such as word-finding, word-order, substitution and omission of grammatical morphemes, sentence structure simplification, loss of sentential stress, sentence planning problems, and increased usage of overlearned phrases. The linguistic ability of patients varies considerably across dysphasia categories on tasks such as free speech, naming, comprehension, repetition and linguistic judgement. In order to classify different varieties of dysphasia (i.e. Global, Wernicke’s, Broca’s, Transcortical Motor and Sensory, Conduction, Anomia, etc.), or to distinguish between normal and dysphasic performances, it is necessary to identify appropriate linguistic measurements which differ significantly across the groups used in the analysis.
Quantifying the agrammatic deficit in stroke patients, most importantly in conversation, provides important guidance for planning their speech and language therapy and for their rehabilitation. Until recently, research on analyzing conversation has been slow. Singh [4,5] has defined a new linguistic approach towards such a quantification. The method uses word-frequencies of lexical items as a measure of the subject’s performance for classifying moderate Broca’s aphasics and normal subjects. If we wish to examine whether different classes of subjects (for example male and female, left- and right-handed, or patients and normal subjects) differ in their spontaneous speech, then their performance data can be used with powerful classification techniques to test this assumption. When these data are gathered chronologically for the same patients, they can be valuable for determining the rate of recovery and for modifying therapy.
In this paper I propose to make use of word-frequency measurements. It has been noted in [6] that dysphasic patients’ language preserves Zipf’s law, which states that the product of an event’s (here, a word’s occurrence) rank r and its frequency f(r) is a constant C, i.e. r × f(r) = C. In other words, both normal and dysphasic speech must maintain an equilibrium: speech cannot be too rich or too poor. Lexical richness depends on the total vocabulary of a subject and his or her weighted use of this vocabulary: the greater the number of words used only once, twice and so on, the higher the richness factor. Hence, comparing the weighted use of vocabulary in dysphasic and normal language is an excellent way of comparing performances. Howes (1964) also remarks that word-frequency measurements can be used for differentiating normal and dysphasic subjects using a plot of their performance on two measurements: mean log frequency and mean square log frequency.
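As a rough illustration of the rank-frequency relation described above, the following minimal Python sketch ranks the words of a transcript by frequency and tabulates the product r × f(r). The function name `rank_frequency_products` and the toy token list are hypothetical and invented for this illustration; they are not taken from the study. Tied frequencies are given consecutive ranks in alphabetical order, which is a simplifying assumption, and on so short a text the product is only loosely constant.

```python
from collections import Counter


def rank_frequency_products(tokens):
    """Return (rank, word, frequency, rank * frequency) for each word type.

    Words are ranked from most to least frequent; ties are broken
    alphabetically (an assumption made only to keep the output stable).
    """
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ranked, start=1)]


# Hypothetical toy transcript, invented for illustration only.
sample = "the man the dog the man saw a dog the".split()
rows = rank_frequency_products(sample)
for rank, word, freq, product in rows:
    print(rank, word, freq, product)
```

On a real transcript of 1000 words or more, one would inspect how far the last column departs from a constant rather than expect exact equality.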
In this paper I seek to establish the use of word-frequency measurements for the classification of Broca’s aphasics and normal controls using neural networks. These measurements quantify subject performance on free speech. This preliminary study is important for the following reasons: (1) it establishes the differences between Broca’s dysphasics and normal controls on a set of linguistic measurements; (2) it identifies patients who have recovered considerably and are now misclassified as normal subjects; (3) it confirms that the linguistic measures proposed for this study can reliably differentiate between patient performances, and can therefore be used in further studies in conjunction with neural networks for quantifying recovery, for differentiating between fluent and non-fluent patients, and for differentiating between patients belonging to different categories based on localizationist theories; (4) the neural network weights at the end of the analysis can identify which variables served as the more important discriminators. For example, most therapy in clinics is directed towards improving the availability of open class lexical items such as nouns, whereas it has been confirmed that nouns are the least important discriminators between dysphasic and normal free speech [7].
1.1. Linguistic Measures of Conversational Performance
It has been observed in the aphasia literature that a very fine-grained analysis of linguistic items, or a pragmatic analysis of the interviewer-speaker interaction, is both difficult to carry out and of limited use for therapeutic purposes. Singh [7] has described a set of 8 linguistic measures, v = (Noun-rate, Pronoun-rate, Adjective-rate, Verb-rate, Type Token Ratio, Clause-like semantic unit rate, Brunet’s Index W and Honoré’s Statistic R). These measures were chosen from a large number of linguistic features after careful experimentation on a set of 100 subjects [7], and they did not correlate significantly with text-length, which was consistently kept between 1000 and 1300 words. The first four measures quantify the rate of occurrence per 100 words of nouns, pronouns, adjectives and verbs respectively. TTR (type-token ratio) is the ratio of the subject’s vocabulary to the text-length of their speech sample. Generally, high TTR values represent a rich use of vocabulary in linguistic terms, and patients suffering from word-finding problems typically have low values on this measure. C-rate (clause-like semantic unit rate) quantifies the subject’s ability to form phrases; a low score is good and implies that the subject can form longer phrases.
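The rate measures and the TTR described above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes the transcript has already been tagged by hand, and the tag names (`NOUN`, `PRON`, `ADJ`, `VERB`, `OTHER`), the function names and the ten-word fragment are all hypothetical. The C-rate is not shown because it depends on the CSU marking rules of [7], which are not reproduced here.

```python
def rate_per_100(tagged_tokens, tag):
    """Occurrences of `tag` per 100 running words of the sample."""
    hits = sum(1 for _, t in tagged_tokens if t == tag)
    return 100.0 * hits / len(tagged_tokens)


def type_token_ratio(tagged_tokens):
    """Vocabulary size (distinct word forms) divided by text-length."""
    words = [word.lower() for word, _ in tagged_tokens]
    return len(set(words)) / len(words)


# Hypothetical hand-tagged fragment, invented for illustration only.
# The study used samples of 1000 to 1300 words; this one has ten.
tagged = [
    ("the", "OTHER"), ("man", "NOUN"), ("sees", "VERB"), ("the", "OTHER"),
    ("big", "ADJ"), ("dog", "NOUN"), ("and", "OTHER"), ("he", "PRON"),
    ("runs", "VERB"), ("away", "OTHER"),
]

noun_rate = rate_per_100(tagged, "NOUN")
pronoun_rate = rate_per_100(tagged, "PRON")
adjective_rate = rate_per_100(tagged, "ADJ")
verb_rate = rate_per_100(tagged, "VERB")
ttr = type_token_ratio(tagged)
print(noun_rate, pronoun_rate, adjective_rate, verb_rate, ttr)
```

Because TTR falls as a text grows longer, values are comparable across subjects only when text-length is held roughly constant, as it was in the study.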
A set of rules was devised for marking clause-like semantic units (CSUs) on transcribed data [7]. The W (Brunet’s Index) [8] and R (Honoré’s Statistic) [9] measures have traditionally been employed in authorship attribution studies of written manuscripts [10,11]. They are given by:
W = N^{V^{-0.165}}    (1)
R = 100 \log(N) / (1 - V_1/V)    (2)
Here N is the overall text-length, V_1 is the number of words used only once and V is the total vocabulary. All of these variables depend on word-frequency. N-rate, P-rate, V-rate, C-rate and W typically have scores between 10 and 25, A-rate between 3 and 10, and R between 1000 and 2000. In the majority of cases, patients have poorer scores on these variables than normal subjects, with the actual differences depending on their degree of lexical deficit.
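The two vocabulary measures can be computed directly from a token list. The sketch below reads equation (1) as Brunet’s index in its usual form, W = N raised to the power V^(-0.165), and equation (2) as R = 100 log(N) / (1 - V1/V). It assumes the natural logarithm, which the text does not state but which is consistent with the quoted range of R for samples of about 1000 words. The function names and the toy token list are hypothetical.

```python
import math
from collections import Counter


def brunet_w(tokens):
    """Brunet's index W = N ** (V ** -0.165)."""
    n = len(tokens)        # N: overall text-length
    v = len(set(tokens))   # V: total vocabulary (distinct words)
    return n ** (v ** -0.165)


def honore_r(tokens):
    """Honore's statistic R = 100 * ln(N) / (1 - V1 / V).

    The natural logarithm is an assumption; the source does not name the base.
    """
    counts = Counter(tokens)
    n = len(tokens)
    v = len(counts)
    v1 = sum(1 for c in counts.values() if c == 1)  # V1: words used only once
    if v1 == v:
        raise ValueError("R is undefined when every word occurs exactly once")
    return 100.0 * math.log(n) / (1.0 - v1 / v)


# Hypothetical toy transcript, invented for illustration only
# (N = 10, V = 5, V1 = 2).
sample = "the man the dog the man saw a dog the".split()
w_value = brunet_w(sample)
r_value = honore_r(sample)
print(w_value, r_value)
```

The guard in `honore_r` covers the one case in which the denominator of equation (2) vanishes; it should not arise in samples of the length used in the study, where many words recur.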
1.2. Pattern Classification Methods
There are several methods for classifying multivariate data. Linear Discriminant Analysis (LDA) is one of the popular methods used in medical research. Principal Components Analysis (PCA) can be used in addition to identify the independent variables which contribute most to the variance in the original data-set. Artificial Neural Networks (ANN) have been used comparatively little in speech and language research. In later sections I shall present the classification results using LDA and ANN techniques, and I shall argue that ANN techniques are far superior in both quantitative and qualitative terms.
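To make the idea of a cross-validated comparison concrete, the following Python sketch runs a leave-one-out cross-validation. It is not the LDA or the neural network used in this study: the classifier is a nearest-class-mean rule, chosen as a stand-in because it is short and deterministic (it coincides with LDA only under the assumption of equal priors and a shared spherical covariance). The leave-one-out scheme, the function names and the feature values are likewise assumptions; the numbers are invented and are not patient data.

```python
import math


def class_means(rows, labels):
    """Mean feature vector of each class in the training set."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, x in enumerate(row):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(means, row):
    """Assign `row` to the class whose mean is nearest (Euclidean)."""
    return min(means, key=lambda label: math.dist(means[label], row))


def leave_one_out_accuracy(rows, labels):
    """Hold out each subject in turn, train on the rest, test on it."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        means = class_means(train_rows, train_labels)
        if predict(means, rows[i]) == labels[i]:
            correct += 1
    return correct / len(rows)


# Invented two-feature toy vectors, for illustration only.
rows = [[12.0, 20.0], [13.0, 22.0], [11.0, 21.0],
        [20.0, 12.0], [22.0, 13.0], [21.0, 11.0]]
labels = ["group_a"] * 3 + ["group_b"] * 3
accuracy = leave_one_out_accuracy(rows, labels)
print(accuracy)
```

Any classifier with a train step and a predict step can be substituted for `class_means` and `predict` without changing the cross-validation loop, which is what allows two methods to be compared on the same held-out subjects.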
Classifying subjects’ speech using linguistic variables can be helpful in many ways. If the classification results are good, they confirm the similarity in language behaviour across subjects within a putative group. These similarities, and the weighted role of the individual measures which are important for such a classification, can then be identified. Knowledge about similarities and differences in language behaviour across different subjects is important not only to speech and language