Preliminary evidence from perceptual experiments and earlier studies shows that dysarthric speakers have a lower speaking rate and longer vowels and consonants than healthy speakers. This paper reports the results of an acoustic investigation, based on rhythmic classifications of speech from duration measurements, carried out to distinguish dysarthric speech from healthy speech. The experiment uses rhythm metrics based on the durational characteristics of consonantal and vocalic intervals and on the Pairwise Variability Index (PVI), applied to the Nemours database of American dysarthric speakers. Results show that, when compared to standard speech, the acoustic measures performed on disturbed speech can effectively be used to characterize the dysarthric speakers and even the severity of dysarthria.
Index terms- Dysarthria, rhythm, Pairwise Variability Index, acoustic analysis, Nemours database
Dysarthria covers various speech disorders resulting from neurological disorders, and it probably represents a significant proportion of all acquired neurological communication disorders. These disorders are linked to the disturbance of brain and nerve stimuli of the muscles involved in the production of speech. Such disorders reflect disturbances in the strength, speed, range, tone, steadiness, timing, or accuracy of movements necessary for prosodically normal, efficient and intelligible speech. In fact, these disorders induce variable speech amplitude and poor articulation. All types of dysarthria affect the articulation of consonants, leading to slurred speech. Vowels may also be distorted in very severe dysarthria. Rhythm problems may be the common feature of the various types of dysarthria. Many studies state that most dysarthric patients have slow speaking rates with long vowel and consonant segments as compared to standard control samples [1, 2, 3, 4, 5, 6, 7].
The present paper focuses on the assessment of rhythmic disturbance in dysarthria caused by cerebral palsy and head trauma. Cerebral palsy refers to a variety of developmental neuromuscular pathologies, occurring in three main forms (spastic, athetoid, and ataxic) and associated with bilateral lesions of upper motor neuron pathways that innervate the relevant cranial and spinal nerves. In this paper, we have spastic, athetoid, and head trauma patients who present dysarthric speech.
Dysarthria severity can be indexed in several ways, but quantitative measures usually focus on prosodic features, mainly intelligibility and speaking rate. Disturbance of rhythm in the speech flow is one of the important factors in prosodic abnormalities. Even though rhythm is identified as the main feature that characterizes dysarthria, assessment methods are mainly based on perceptual evaluation measures. Despite their numerous advantages, which include ease of use, low cost and clinicians' familiarity with the related procedures, perceptual methods suffer from a number of inadequacies that affect their reliability. These methods also lack evaluation protocols that may help standardize judgments between clinicians and/or evaluation tools. Therefore, the aim of this work is to quantify rhythm abnormalities in dysarthric speech through the rhythm metrics (Ramus and Grabe parameters) developed recently, especially in the language identification domain [8, 9]. Grabe et al. do not relate speech rhythm to phonological units such as interstress intervals or syllable durations. Instead, they calculate durational variability in successive acoustic-phonetic intervals using Pairwise Variability Indices (PVI) [10]. The raw Pairwise Variability Index (rPVI) is given in equation (1):
$$\mathrm{rPVI} = \frac{1}{N-1} \sum_{k=1}^{N-1} \left| d_k - d_{k+1} \right| \qquad (1)$$
where $d_k$ is the duration of the k-th vocalic or intervocalic segment and $N$ is the number of segments. A normalized version of the PVI (denoted nPVI) is defined by:
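$$\mathrm{nPVI} = \frac{100}{N-1} \sum_{k=1}^{N-1} \left| \frac{d_k - d_{k+1}}{(d_k + d_{k+1})/2} \right| \qquad (2)$$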
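As an illustration, the two indices can be sketched in Python. This is a minimal sketch, assuming the usual definitions: rPVI is the mean absolute difference between successive interval durations, and nPVI is the mean of those differences, each divided by the mean of the pair, scaled by 100. The function names and the example durations are hypothetical and are not taken from the experiment.

```python
from typing import Sequence


def rpvi(durations: Sequence[float]) -> float:
    """Raw PVI: mean absolute difference between successive durations."""
    n = len(durations)
    if n < 2:
        raise ValueError("at least two intervals are required")
    diffs = [abs(a - b) for a, b in zip(durations, durations[1:])]
    return sum(diffs) / (n - 1)


def npvi(durations: Sequence[float]) -> float:
    """Normalized PVI: each difference is divided by the mean of its pair."""
    n = len(durations)
    if n < 2:
        raise ValueError("at least two intervals are required")
    terms = [abs(a - b) / ((a + b) / 2.0)
             for a, b in zip(durations, durations[1:])]
    return 100.0 * sum(terms) / (n - 1)


# Hypothetical vocalic interval durations in milliseconds.
vowels_ms = [100.0, 200.0, 100.0]
print(rpvi(vowels_ms))            # mean of |100-200| and |200-100|
print(round(npvi(vowels_ms), 2))

# Doubling every duration (slower speech) doubles rPVI but leaves nPVI unchanged.
slow_ms = [2 * d for d in vowels_ms]
print(rpvi(slow_ms), round(npvi(slow_ms), 2))
```

The last two lines show why the normalized index is less sensitive to overall speaking rate than the raw index, which matters when comparing slow dysarthric speech with control speech.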
Ramus and colleagues argued that a viable account of speech rhythm should not rely on complex and language-dependent phonological concepts but on purely phonetic characteristics of the speech signal [8]. They measured vowel durations and the duration of the intervals between vowels. Ramus et al. computed three acoustic correlates of rhythm from these measurements: (a) %V, the proportion of time taken up by vocalic intervals in the sentence; (b) ΔV, the standard deviation of vocalic intervals; (c) ΔC, the standard deviation of inter-vowel intervals. Ramus et al. argued that a combination of %V and ΔC provided the best acoustic correlate of rhythm classes [9]. Our goal is to use these metrics to distinguish between healthy and dysarthric speakers and to characterize intelligibility, because alterations of rhythm may also affect speech intelligibility [10, 11, 12].
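The three Ramus-style correlates can be sketched as follows. This is only an illustrative sketch: it assumes the sentence is available as a list of (class, duration) phone labels with class "V" or "C", that adjacent phones of the same class are merged into one interval, and that the population standard deviation is used (the source does not say which estimator was applied). All names and data are hypothetical.

```python
from itertools import groupby
from statistics import pstdev


def merge_intervals(phones):
    """Collapse adjacent phones of the same class into a single interval."""
    return [(cls, sum(dur for _, dur in run))
            for cls, run in groupby(phones, key=lambda p: p[0])]


def ramus_metrics(phones):
    """Return %V, deltaV and deltaC for one sentence."""
    intervals = merge_intervals(phones)
    vocalic = [d for cls, d in intervals if cls == "V"]
    consonantal = [d for cls, d in intervals if cls == "C"]
    total = sum(vocalic) + sum(consonantal)
    return {
        "%V": 100.0 * sum(vocalic) / total,
        "deltaV": pstdev(vocalic),       # assumption: population std. dev.
        "deltaC": pstdev(consonantal),   # assumption: population std. dev.
    }


# Hypothetical phone-level segmentation of one sentence (durations in ms).
phones = [("C", 50.0), ("C", 30.0), ("V", 120.0),
          ("C", 100.0), ("V", 80.0), ("C", 120.0)]
print(merge_intervals(phones))
print(ramus_metrics(phones))
```

Merging before measuring matters: a consonant cluster counts as one consonantal interval, so ΔC reflects interval variability rather than single-phone variability.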
This paper is organized as follows. Section 1 is an introduction to the work. Section 2 presents the speech material, subjects and procedures used throughout the experiment and explains the different indices and measures proposed to characterize dysarthria. Section 3 deals with a quantitative acoustic analysis of dysarthric speech. Section 4 covers the probabilistic classification used to determine which sets of rhythmic predictor variables best discriminate between dysarthric and control speakers. Section 5 concludes the work and provides some perspectives.
2.1. Speech Material
Nemours is one of the few databases of recorded dysarthric speech. It contains recordings of American patients suffering from different types of dysarthria. The evaluation methodology followed in Nemours is inspired by the work of Kent [13, 14]. Kent [1] presents a method that starts by identifying the reasons for the lack of intelligibility and then adapts the rehabilitation strategies. His test consists of a list of words from which four words are selected. The patient is supposed to listen to these words and repeat them out loud. The evaluation takes into account the phonetic contrasts that can be disrupted.
The full set of stimuli consists of 74 monosyllabic nouns and 37 bisyllabic verbs embedded in short nonsense sentences (two nouns and a verb per sentence). Each speaker records 74 sentences: the first 37 are randomly generated from the stimulus word list (so each speaker's 74 sentences differ from those of the other speakers), and the other 37 are obtained by swapping the first and second nouns of the first 37 sentences. Sentences have the following form:
THE noun 1 IS verb-ING THE noun 2.
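The construction of one speaker's sentence list can be sketched as below. This is a hypothetical reconstruction: it assumes that each of the 74 nouns and each of the 37 verbs is used exactly once in the first 37 sentences, which the word counts suggest but the text does not state. The placeholder word lists are invented for the example and are not the Nemours stimuli.

```python
import random


def build_sentences(nouns, verbs_ing, seed=0):
    """Return 2 * len(verbs_ing) sentences: random ones, then noun-swapped ones."""
    if len(nouns) != 2 * len(verbs_ing):
        raise ValueError("expected two nouns per verb")
    rng = random.Random(seed)          # a different seed per speaker
    nouns, verbs_ing = list(nouns), list(verbs_ing)
    rng.shuffle(nouns)
    rng.shuffle(verbs_ing)
    first = [(nouns[2 * i], verbs_ing[i], nouns[2 * i + 1])
             for i in range(len(verbs_ing))]
    swapped = [(n2, verb, n1) for n1, verb, n2 in first]
    return [f"THE {n1} IS {verb} THE {n2}." for n1, verb, n2 in first + swapped]


# Placeholder stimuli (hypothetical names, not the real word list).
nouns = [f"NOUN{i:02d}" for i in range(74)]
verbs_ing = [f"VERB{i:02d}ING" for i in range(37)]
sentences = build_sentences(nouns, verbs_ing, seed=1)
print(len(sentences))
print(sentences[0])
print(sentences[37])   # same verb as sentences[0], nouns swapped
```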
Each recording session was conducted by a speech pathologist in a small sound-dampened room, with the speaker seated (typically in his wheelchair) next to the speech pathologist or experimenter and in front of a table-mounted microphone (Electro-Voice RE55) connected to a digital audio tape recorder (Sony PCM-2500). Nonsense sentences were written in large print on a sheet placed in front of the speaker, and each sentence was read first by the experimenter and then repeated by the subject. This assisted all speakers in the pronunciation of words and was essential for some subjects with limited eyesight or literacy. The speech material is sampled at 16 kHz with 16-bit sample resolution after low-pass filtering at a nominal 7500 Hz cutoff frequency with a 90 dB/octave filter. This achieved about 12 dB of attenuation at the Nyquist frequency [13, 14]. The extraction of acoustic information and derived parameters is performed using the Snack package of KTH [15].
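The quoted attenuation figure can be checked with a rough calculation. The sketch below is an idealized estimate, not a model of the actual filter: it assumes a loss of about 3 dB at the nominal cutoff and a straight 90 dB/octave roll-off above it. Both the 3 dB figure and the function name are assumptions made for this example.

```python
import math


def attenuation_db(freq_hz, cutoff_hz=7500.0,
                   slope_db_per_octave=90.0, cutoff_loss_db=3.0):
    """Idealized loss: assumed loss at cutoff plus asymptotic roll-off."""
    octaves_above_cutoff = math.log2(freq_hz / cutoff_hz)
    return cutoff_loss_db + slope_db_per_octave * octaves_above_cutoff


nyquist_hz = 16000 / 2                 # 16 kHz sampling rate
print(round(attenuation_db(nyquist_hz), 1))
```

Under these assumptions the estimate comes out slightly above 11 dB, which is in line with the "about 12 dB" reported for the database.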
Only eight of the 74 sentences for each dysarthric and healthy speaker were segmented and labeled manually into phonemic intervals. The beginning and end of each phoneme within each phrase were marked by listening to the waveform and using the intensity envelope as a guide.
The speakers are eleven young adult males with dysarthria due to cerebral palsy (CP) or head trauma (HT) and one non-dysarthric adult male (the experimenter). Seven speakers have cerebral palsy: three have CP with spastic quadriplegia, two have athetoid CP, and two have a mixture of spastic and athetoid CP with quadriplegia. The four remaining subjects are victims of head trauma (one quadriplegic and one with spastic quadriparesis), with cognitive function ranging between Levels VI and VII on the Rancho Scale. The speech of one of the speakers (head trauma, quadriplegic, KS) was extremely unintelligible. A two-letter code was assigned to each patient: BB, BK, BV, FB, JF, KS, LL, MH, RK, RL and SC.
The Frenchay dysarthria assessment scores (see Table 1) of motor function associated with each speaker revealed that the patients were highly heterogeneous and fell into three subgroups. The first subgroup is mild and includes subjects FB, BB, MH and LL. The second, between the first and the third subgroups, includes subjects RK, RL, and JF. The third is severe and includes subjects KS, SC, BV, and BK. The perceptual data and the speech assessment did not take into consideration the too severe case (patient KS) and the too mild case (patient FB) [13, 14].
2.3. The Rhythm Metrics
For each dysarthric sentence of each speaker, we measured the durations of the vocalic, consonantal, voiced and unvoiced segments. This allows us to compute the following parameters:
%V: Percentage of sentence duration made up of vocalic intervals
ΔC: Standard deviation of consonantal intervals
ΔV: Standard deviation of vocalic intervals
%VS: Percentage of sentence duration made up of voiced intervals
Δ(VS): Standard deviation of voiced intervals
Δ(VNS): Standard deviation of unvoiced intervals
Vocalic-rPVI: Raw pairwise variability index for vocalic intervals
Vocalic-nPVI: Normalized pairwise variability index for vocalic intervals
Intervocalic-rPVI: Raw pairwise variability index for intervocalic intervals
Intervocalic-nPVI: Normalized pairwise variability index for intervocalic intervals
Table 1: The rhythm metrics parameters
Percentage of severity
3. RESULTS AND DISCUSSIONS
The mean and the standard deviation of the duration of vocalic and consonantal intervals are given in Table 2. The results clearly confirm that the durations of both interval types are greater for the dysarthric patients (DP) than for the healthy control (HC).
Table 2: Mean and standard deviation of consonantal and vocalic interval durations.
Figure 1 shows the distribution of DP and HC along the %V (X axis) and ΔC (Y axis) dimensions. The figure illustrates how vocalic intervals represent less than 30% of the total duration of a sentence for the severe cases (KS, BK, SC, RL), which is to be expected because a vowel that is spoken less clearly tends to be reduced, whereas the less severe cases are much closer to the control subjects. The (%V, ΔV) plane shows about the same distribution of the DP (KS, BK, SC and RL are relatively isolated from the rest of the speakers). Finally, we observed a high ΔC, a high ΔV and a low %V for the DP, particularly for the severe cases (KS, BK, SC, RL).
Duration (ms)
Figure 1: Distribution of DP (Dysarthric Patient) and HC (Healthy Control) along the %V (X axis) and Δ(V) (Y axis) dimensions
A one-way analysis of variance (ANOVA) was conducted to determine whether the metrics demonstrated significant group differences. For ΔC and %V, the main effect of group was not statistically significant (F(1,20) = 3.04, p = 0.09, and F(1,20) = 0.91, p = 0.35, respectively), but for the standard deviation of vocalic intervals, the main effect of group was significant (F(1,20) = 5.35, p = 0.03).
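For readers who want to reproduce this kind of test, a one-way ANOVA F statistic can be computed directly from its definition. The sketch below is a plain-Python illustration with made-up numbers. It returns only F and the degrees of freedom; a library routine such as SciPy's `scipy.stats.f_oneway` would also supply the p-value.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    values = [x for g in groups for x in g]
    n_total, n_groups = len(values), len(groups)
    grand_mean = sum(values) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = n_groups - 1, n_total - n_groups
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical metric values for two groups (not the paper's data).
dp_values = [1.0, 2.0, 3.0]
hc_values = [2.0, 3.0, 4.0]
print(one_way_anova([dp_values, hc_values]))
```

With two groups and 22 observations in total, the same function yields degrees of freedom (1, 20), which matches the F(1,20) form of the reported statistics.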
Indeed, we can observe in Figures 2 and 3 that almost all DP have a Vocalic-nPVI that is lower than that of the HC but a higher Vocalic-rPVI. However, for the Intervocalic-rPVI (and even the Intervocalic-nPVI), the DP and HC were similar except for the most severe DP: KS, RK and BK.
The main effect of group for Vocalic-rPVI and Vocalic-nPVI was statistically significant (F(1,20) = 5.93, p = 0.02, and F(1,20) = 10.6, p = 0.004, respectively). The main effect of group for the Intervocalic-rPVI and Intervocalic-nPVI was not statistically significant (F(1,20) = 3.58, p = 0.07, and F(1,20) = 0.058, p = 0.81, respectively).
We can observe that BV, who is considered a severe case, is always close to the HC and the mild DP. FB, the mildest case, was relatively far from the HC.
Figure 2: Dysarthric and healthy subjects (DP and HC respectively) represented in the (nPVI, inter-rPVI) space for the vocalic and intervocalic intervals
Figure 3: rPVI index representation of healthy and dysarthric subjects
A total of 74 sentences for each speaker (814 utterances by dysarthric speakers and the same number of utterances by the healthy control) were automatically segmented into voiced and unvoiced intervals. The results given in Table 3 reveal that the severe dysarthric patients tend to produce elongated unvoiced segments with higher standard deviations, voiced intervals that are longer than the unvoiced intervals, and a greater number of both kinds of segment. The duration of the sentences repeated by patient KS (the most severe DP) was far greater than that of the other DP, whose durations were in turn greater than those of the healthy control. KS had the greatest number of voiced and unvoiced segments, and the most elongated ones.
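The voiced/unvoiced statistics can be illustrated with a short sketch. It assumes that the automatic segmentation yields one voicing decision per analysis frame (for example from a pitch tracker) at a fixed frame step; the source does not describe the segmentation method, so the 10 ms step, the function names and the flag sequence are all hypothetical.

```python
from itertools import groupby
from statistics import pstdev


def voicing_intervals(flags, frame_s=0.01):
    """Run-length encode per-frame voicing flags into (is_voiced, seconds)."""
    return [(bool(v), sum(1 for _ in run) * frame_s)
            for v, run in groupby(flags)]


def voicing_stats(flags, frame_s=0.01):
    """Percentage of voiced time plus count and std. dev. per interval type."""
    intervals = voicing_intervals(flags, frame_s)
    voiced = [d for is_v, d in intervals if is_v]
    unvoiced = [d for is_v, d in intervals if not is_v]
    total = sum(voiced) + sum(unvoiced)
    return {
        "pct_voiced": 100.0 * sum(voiced) / total,
        "n_voiced": len(voiced),
        "n_unvoiced": len(unvoiced),
        "sd_voiced": pstdev(voiced),      # assumption: population std. dev.
        "sd_unvoiced": pstdev(unvoiced),  # assumption: population std. dev.
    }


# Hypothetical frame-level decisions: 1 = voiced, 0 = unvoiced.
flags = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1]
print(voicing_intervals(flags))
print(voicing_stats(flags))
```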
Table 3: Mean and standard deviation computed for the durations of voiced and unvoiced intervals.
Mean (s)
We can observe from Figure 4 that the DP are widely scattered, whereas the HC are well grouped. The most severe cases (KS, RK, RL, and SC) are the DP farthest from the HC, with the lowest values of %DVs, while the mildest cases (FB and BV) have the highest values of %DVs. The rest of the DP are close to the HC. We note that DP BV, whose Frenchay and intelligibility scores are 57.5 and 3 respectively, is very close to the HC for all the parameters studied. Indeed, on examining the speech of patients BV and FB, we noted that the speed of BV's speech was quite normal and that the speech was almost intelligible but nasal (BK is the most severe, although his Frenchay and intelligibility scores are 58.2 and 3 respectively). FB is the mildest case, but he is not the closest DP to the HC. In fact, his speech is very intelligible but his speech rate is very slow.
Figure 4: Dysarthric and healthy subjects represented in the (%DVs, Δ(NVs)) space.
4. CLASSIFICATION SYSTEM
We use linear discriminant analysis, which is a Gaussian classification technique based on a covariance shared across the classes, in the plane of two predictor parameters. In our case, we consider the planes (%V, ΔC), (%V, ΔV), (Vocalic-nPVI, Intervocalic-nPVI) and (Vocalic-rPVI, Intervocalic-rPVI). In order to get an idea of the separation power of the parameters studied above, a closed test, which involves training and testing on the same data, was adopted. Table 4 shows that the (%V, ΔV) plane gives the best separation score (95.45%) of the classes (DP and HC). The rPVI plane, with 86.36% correct separation of the dysarthric patients from the healthy control, also gives a very encouraging score, bearing in mind that the test is closed.
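The closed-test procedure can be sketched for one two-parameter plane. This is a minimal two-class linear discriminant with a pooled (shared) covariance matrix, written in plain Python for illustration; it is not the implementation used in the study, and the data points are hypothetical (%V, ΔV) pairs invented for the example.

```python
import math


def mean2(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def fit_lda(class_a, class_b):
    """Two-class LDA in 2-D with one covariance matrix shared by both classes."""
    ma, mb = mean2(class_a), mean2(class_b)
    sxx = sxy = syy = 0.0
    for points, (mx, my) in ((class_a, ma), (class_b, mb)):
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(class_a) + len(class_b) - 2
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # w = S^-1 (mean_a - mean_b), using the closed-form 2x2 inverse.
    w = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    midpoint = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    bias = -midpoint + math.log(len(class_a) / len(class_b))  # class priors
    return w, bias


def predict(model, point):
    w, bias = model
    return "A" if w[0] * point[0] + w[1] * point[1] + bias > 0 else "B"


def closed_test_accuracy(class_a, class_b):
    """Train and test on the same data, as in a closed test."""
    model = fit_lda(class_a, class_b)
    hits = sum(predict(model, p) == "A" for p in class_a)
    hits += sum(predict(model, p) == "B" for p in class_b)
    return hits / (len(class_a) + len(class_b))


# Hypothetical (%V, deltaV) points: class A = DP, class B = HC.
dp = [(28.0, 60.0), (30.0, 55.0), (33.0, 50.0), (35.0, 48.0)]
hc = [(42.0, 30.0), (44.0, 28.0), (45.0, 33.0), (43.0, 25.0)]
print(closed_test_accuracy(dp, hc))
```

Because the classifier is scored on its own training data, the figure it returns is an optimistic estimate. The reported scores are consistent with 22 classified items: 21/22 is about 95.45% and 19/22 is about 86.36%.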
Table 4: Confusion matrices for all parameters
%V, Δ(V)
%V, Δ(C)
5. CONCLUSION AND PERSPECTIVES
In this paper, we have tried to present acoustic evidence for the rhythm-based assessment of dysarthric speech. The rhythm metrics are based on the durational characteristics of vocalic and intervocalic intervals and on their PVI, using both raw and normalized measures. The measures are then used to characterize the type and severity of dysarthria from various duration measurements. The timing of voiced and unvoiced segments clearly shows differences between healthy and dysarthric subjects. We have computed acoustic variability indices and found that these indices effectively express the severity level of the dysarthria. Therefore, these features might be very useful when included in software tools that help in the diagnosis and training of dysarthric subjects. Our approach in this work focused on the rhythm of dysarthric speech by considering the proportion of vocalic, consonantal, voiced and unvoiced intervals in speech utterances (sentences), but rhythm can also be considered as a hierarchical organization of temporally coordinated prosodic units. This will lead us, in future work, to study the impact of prosodic features and to continue to evaluate vowel and consonant segments in the context of disturbed speech. For the classification, we will validate our results using an open test with a large data set.