<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Autonomous Guitar Effects Activation Platform</title>
</head>
<body>
<a href="./directory.html">NAVIGATION</a>
<h1 align="center">Triggered Guitar Effects Platform</h1>
<p> </p>
<p>DTW Audio Subsequence Matching for Autonomous Audio Control Actions</p>
<!-- Demo video: Google Drive videoplayback embed (the signed URL has expired) -->
<iframe src="SR Project II Presentation.pdf" height="1000px" width="100%" scrolling="yes" frameborder="yes"
allowtransparency="true" allowfullscreen="true"
sandbox="allow-forms allow-pointer-lock allow-popups allow-same-origin allow-scripts allow-modals"></iframe>
<iframe src="Triggered Guitar Effects Platform Design Review.pdf" height="1000px" width="100%" scrolling="yes"
frameborder="yes" allowtransparency="true" allowfullscreen="true"
sandbox="allow-forms allow-pointer-lock allow-popups allow-same-origin allow-scripts allow-modals"></iframe>
<iframe src="sudo apt-get install python-numpy" height="1000px" width="100%" scrolling="yes" frameborder="yes"
allowtransparency="true" allowfullscreen="true"
sandbox="allow-forms allow-pointer-lock allow-popups allow-same-origin allow-scripts allow-modals"></iframe>
<h2>Team Members:</h2>
<h2>Bryan Guner</h2>
<h2>Haley Scott</h2>
<h2>Ralph Quinto</h2>
<p> </p>
<p>Primary Advisor: Dr. Ambrose Adegbege</p>
<p>May, 2018</p>
<p> </p>
<h2>Acknowledgements</h2>
<p>
The team would like to thank Dr. Ambrose Adegbege for his input and enthusiasm in tackling our design challenges. We would also like to thank Dr. Larry Pearlstein for his guidance and for suggesting Dynamic Time Warping. In addition, our team thanks Mihir Beg for his consultation on MIDI transcription and for suggesting Pure Data.
</p>
<hr>
<h2>Abstract</h2>
<p>
<b>In live performance, guitar effect pedals are a versatile yet limiting asset. They require presence of mind on the part of the performer and restrict the performer to the area of the stage where the pedal board is located. These constraints limit performance quality and stage presence by splitting the performer's focus. This project proposes an automatic solution to the restrictions that guitar effect pedals present. The performer records the primary performances into the proposed software, which analyzes and stores the sequential frequencies. The performer then uses the software during a subsequent live performance to trigger effects when the preceding frequencies of the live performance are recognized against the first performance. This platform is realized in Pure Data, a graphical environment for building audio manipulation applications. Our team has designed an application that implements dynamic time warping (DTW) to compare the primary performances against the live performance. The system compares MIDI data using a dynamic time warping distance threshold, as opposed to a Euclidean distance threshold, making it a robust approach to mitigating live performance error.</b>
</p>
<p>
<strong>Keywords: Pure Data, Dynamic Time Warping</strong>
</p>
<hr>
<p> </p>
<h1>Contents</h1>
<p>Abstract</p>
<p>Introduction</p>
<p>Specifications</p>
<p>1.0 Chapter 1: Background</p>
<p>1.1 How the Electric Guitar Works</p>
<p>1.2 How the Pedal Board Works</p>
<p>1.3 Guitar Signal Analysis</p>
<p>1.4 Why Transcription Might Not Be That Important</p>
<p>2.0 Chapter 2: The Counting Method</p>
<p>2.1 Concept</p>
<p>2.1.1 Primary Recording</p>
<p>2.1.2 Primary Analysis</p>
<p>2.1.3 Live Recording</p>
<p>2.1.4 Primary and Live Comparison</p>
<p>2.1.5 Method Expansion</p>
<p>2.2 Pure Data Effects</p>
<p>2.2.1 Digital Effect Design in Pure Data</p>
<p>2.2.2 Delay Effect in Pure Data</p>
<p>2.2.3 Fuzz Effect in Pure Data</p>
<p>2.2.4 Reverb Effect in Pure Data</p>
<p>2.2.5 Spectral Delay Effect in Pure Data</p>
<p>3.0 Chapter 3: Dynamic Time Warping Method</p>
<p>3.1 Dynamic Time Warping</p>
<p>3.2 Why Dynamic Time Warping Is a Good Choice for Effect Triggering</p>
<p>4.0 Chapter 4: Pure Data and Method Implementation</p>
<p>4.1 Introduction</p>
<p>4.2 First Approach</p>
<p>4.3 Python Implementation</p>
<p>4.4 Pure Data Design</p>
<p>4.5 DTW Implementation in Java</p>
<p>4.6 DTW in Pure Data</p>
<p>5.0 Chapter 5: Testing and Validation Execution</p>
<p>5.1 Methods of Functionality Testing</p>
<p>5.2 Methods of Specification Testing</p>
<p>5.3 Realistic Constraints Testing</p>
<p>6.0 Chapter 6: Conclusion</p>
<p>6.1 Core Intent</p>
<p>6.2 Results Achieved</p>
<p>6.3 Expectations and Modifications</p>
<p> </p>
<p>Appendices:</p>
<p>Appendix A: Project Overview</p>
<p>Appendix B: Project Management</p>
<h2>Nomenclature</h2>
<p>
<strong>Primary performances</strong>: guitar signal data recorded at home or in the studio, before the live performance.
</p>
<h2>Introduction</h2>
<p>
Musicians who want to deliver inspired live performances would derive great utility from automatic effect triggering. This concept allows users to focus on their performance rather than on the management of their sound effects. The value of automatic triggering arises not only in convenience, but also in economic expense: guitar effect pedals are far more costly than digital effects, as digital effects have significantly fewer points of failure. This idea serves to minimize distractions and cost to the performer, while maximizing the experience of the live performance for the audience. This task will be achieved through the methods of dynamic time warping and frequency counting.
</p>
<p>
Dynamic time warping provides a stronger solution to the issue of performance error. The algorithm uses a comparison method to find the optimal correlation between the live performance and a pre-recorded performance. Here, the performer records four primary performances. The dynamic time warping algorithm is applied between the first performance and each subsequent performance, producing a sum of distances across these three pairs of recordings. This sum is then padded with a tolerance and compared with the sum of distances between the fifth (live) performance and the first primary performance. The method is applied to smaller sections of the performance, providing the knowledge of when to trigger the desired effect. This algorithm proves to be highly tolerant of error, but fairly complex.
</p>
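<p>
To make the thresholding scheme above concrete, here is a minimal sketch (not from the report) of the calibration step. It assumes a dtw_distance(a, b) function such as the one sketched at the end of Chapter 3; the 10% tolerance pad is illustrative, as the report does not specify a value.
</p>
<pre><code>def calibrate_threshold(primary_takes, dtw_distance, tolerance=0.10):
    """Sum the DTW distances from the first take to each of the other
    primary takes, then pad that sum with a tolerance."""
    reference = primary_takes[0]
    total = sum(dtw_distance(reference, take) for take in primary_takes[1:])
    return total * (1.0 + tolerance)

def should_trigger(reference, live_section, threshold, dtw_distance):
    # Trigger when the live section sits within the padded distance
    # budget established by the rehearsal takes.
    return dtw_distance(reference, live_section) <= threshold
</code></pre>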
<p>
The alternate approach is the implementation of a frequency counting algorithm. Here, the user records only one primary performance, which is used to create a sequence of frequencies with a given tolerance. The number of changes in frequency is counted and stored. The frequencies from the live performance are then verified against the pre-recorded frequencies, and the number of changes in frequency is checked against the pre-recorded count of changes. Once the separate counts from each performance are equal, the effect is triggered. This method proves to be extremely straightforward, yet offers few defenses against the problem of live performance error.
</p>
<p>
Throughout this report, our team presents the methods used in our current project, as well as the next steps in the design and modification process. A more in-depth description of the two aforementioned methods is given, including the pros and cons of each.
</p>
<h2>Specifications</h2>
<p>
The project was designed to integrate as seamlessly as possible with current guitar setups. Minimizing the delay between the performance and effect triggering is crucial so that musicians are not thrown off by the offset. Based on the research our team completed, the system threshold was set to a maximum trigger latency of 1 second; anything longer would be too disruptive to the performance. Another crucial specification is a sampling rate at least twice the fundamental frequencies of the guitar. For electric guitars, fundamental frequencies range from about 80 Hz to nearly 400 Hz. The software used in this project has a default sampling rate of 44,100 Hz, which more than satisfies the Nyquist criterion. In addition, having many guitar effects offers great versatility to a performance, so the group set a goal of implementing three different guitar effects along with the completion of the system.
</p>
<h2>1.0 Chapter 1: Background</h2>
<h3>1.1 How the Electric Guitar Works</h3>
<p>
An electric guitar is a stringed instrument that uses a device called a pickup to electrify and subsequently amplify the guitar signal. The majority of guitar pickups are permanent magnets wrapped with a coil of insulated wire, which means the pickup is in most cases a passive element that generates a voltage via electromagnetic induction. Most pickups have greater sensitivity to higher-frequency notes, because higher frequency means higher string velocity, which is proportional to the output of the pickup. An engineer who is not musically inclined can think of the pickup as a transducer functioning as an inductor.
</p>
<h3>1.2 How the Pedal Board Works</h3>
<p>
A pedal board usually consists of a power source and a mechanical housing for what are called stomp boxes. Also known as effect pedals, stomp boxes are activated manually by the guitarist during a performance and alter the content of the guitar signal in some way. Some are as simple as an EQ sweep that produces the famous wah tone popularized by artists such as Jimi Hendrix and Slash, while other effects, like delays and harmonizers, are more complex and contain multiple internal stages of effect processing. A guitarist's pedal board usually houses anywhere from three to twenty effect pedals and can contain tens of thousands of dollars' worth of gear. Conventionally, the guitar is run through the effect pedals and then into an amplifier; however, higher-quality amplifiers may provide an effects loop that allows for different configurations of pre- and post-effect amplification. While this method is effective enough to be the go-to setup for practically all gigging guitarists, there are some fundamental limitations to this configuration: the pedal board is usually power hungry, expensive, and stationary.
</p>
<h3>1.3 Guitar Signal Analysis</h3>
<p>
In order to accurately trigger effects based on an audio signal, one must accurately catalog the content of that signal. There are many approaches to the analysis of guitar signals, but to make sense of them one must first understand what it is they are measuring. A musical signal can be broken into different musical feature representations. To simplify, one can assume there are only four features contained in any signal. The guitar signal will contain dynamic features, that is, the volume of different notes and sections of the music in relation to the others. It will also contain pitch information, which reveals the frequency of the notes being played at any given time. Further, one may examine the timbre, which relates to the deviation of a note's harmonics from ideal integer multiples of the fundamental frequency, as well as the energy contained within them. Finally, one may record the tempo, the speed at which the performance and its subsections were played. There are countless analysis techniques for each musical feature and various schemes to combine and compare them. Below we describe a few of the more robust analyses; note, however, that our current approach uses a Pure Data object named Fiddle, which roughly approximates the guitar signal as MIDI data, recording only pitch and tempo.
</p>
<p>
In terms of dynamics, one may take the mean squared amplitude of segments of a sound file. Alternatively, one may try to remove the issue of variable note amplitude entirely by classifying chords, masking out the notes hypothesized to be in a given sound segment with a chord-template bit mask. In terms of pitch, there are nearly countless algorithms, because pitch is the most important feature for accurately transcribing music. Almost all of these approaches fall into either time-domain analysis (usually some variation of autocorrelation) or frequency-domain analysis (often performed on the results of an FFT, or Fast Fourier Transform). Some notable variations are SNAC autocorrelation (utilized by the Pd object helmholtz), chroma-vector or pitch-class-profile transcription (bit-mask approximations of the performance based on predefined FFT spectrum content of various chords), and the cepstrum (the magnitude spectrum of the log of the magnitude spectrum of a window of the audio signal, which can be used to determine the frequency of its peaks). Timbre has analysis techniques such as the spectral centroid; however, it does not matter much in the context of transcription, because the only information it can extrapolate is which instrument the piece was played on. In terms of tempo, the most promising approach to accurate transcription is the separation of the signal into transient and steady-state sections. The transient sections would then signify note onsets and therefore yield tempo. This separation would be performed by looking at the variance of phase information extrapolated from the frequency bins of the signal's wavelet transformation in relation to the previous few bins. While the approaches here are only a small fraction of all material on the subject, it turns out that effect triggering does not require accurate transcription of the audio file, but rather consistent recording and comparison from one performance to another.
</p>
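<p>
As a concrete illustration of the time-domain family mentioned above, here is a minimal autocorrelation pitch sketch in Python with NumPy. It is illustrative only: plain autocorrelation, not the SNAC variant used by the helmholtz object. The search range uses the 80 Hz to 400 Hz guitar fundamentals cited in the Specifications.
</p>
<pre><code>import numpy as np

def autocorr_pitch(frame, sample_rate=44100, fmin=80.0, fmax=400.0):
    """Estimate the fundamental of one analysis frame (e.g. 4096
    samples) by finding the strongest autocorrelation lag within
    the guitar's fundamental-frequency range."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / fmax)   # shortest period of interest
    lag_max = int(sample_rate / fmin)   # longest period of interest
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return sample_rate / lag            # estimated fundamental, in Hz
</code></pre>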
<h3>1.4 Why Transcription Might Not Be That Important</h3>
<p>
Preliminary research on this project focused on transcription techniques accurate enough to create a profile for each song, so that subsequent performances could be compared, within certain tolerances, on a nearly note-for-note basis. Eventually it became apparent that this methodology was fraught with pitfalls. The most obvious is that these analysis techniques are complex and computationally expensive. The next challenge was that even if the features could be accurately extracted, how could they be combined in real time for a comparison? It became apparent that most techniques did not account for large differences in tempo between performances, or even for moments of silence in the guitar signal. While solutions for integrating some of these techniques were available or even obvious, it appeared that there would be many challenges to overcome. Further, even if accurate transcription were achievable, would the comparison from one performance to another be reliable once human error is introduced?
</p>
<p>
Taking all of these hurdles into account, the most economical approach is to focus on the comparison of two recordings rather than on deciphering the information contained within them. This mentality toward effect triggering has proven to be not only immeasurably more efficient, but also more robust. The efficiency is gained through the simple spectral analysis employed and by making it unnecessary to combine different analyses. The strength of this approach lies in the fact that focusing on a comparison technique allows for much more variance in the timing of the performances of a given song. Further, if transcription-centric effect triggering were at the forefront of consideration, a comparison technique would still be necessary, so focusing on comparison first allows for redesign and reconsideration later on. The only assumption this methodology hinges upon is that a less precise transcription technique misrepresents the information in the audio signal consistently, the same way every time.
</p>
<h2>2.0 Chapter 2: The Counting Method</h2>
<h3>2.1 Concept</h3>
<p>
The initial material researched by each team member contributed greatly to the original concepts of the design plan. The team studied various topics related to this project, including speech processing techniques, gesture recognition techniques, and audio transcription techniques. This material allowed our team to define the design requirements and functionality of our proposed system. The concept involves a pre-recorded performance, which is to be compared to the live performance. Through the user interface, the performer marks the first on-trigger point, and subsequently the first off-trigger point, for the effect. With the knowledge of which part of the song the performer is playing, the device can trigger effects automatically during the live performance.
</p>
<h3>2.1.1 Primary Recording</h3>
<p>
In order to compare against a live performance, the device needs to be able to accurately count and threshold a pre-recorded performance. Our team proceeded by developing an algorithm to accurately and precisely map the pre-recorded performance. After completing research on windowing and frequency bins, we agreed on a straightforward approach: the guitar signal enters the computer as a bit stream and is analyzed using a windowed Fast Fourier Transform (FFT). This data describes the song as a sequential frequency representation. Essentially, this information defines the song in terms of the notes being played, as each note represents a unique frequency.
</p>
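<p>
A minimal sketch of this windowed-FFT mapping is shown below (illustrative Python, not the report's implementation; the window size is assumed, and the dominant FFT bin is taken as the note frequency).
</p>
<pre><code>import numpy as np

def frequency_sequence(signal, sample_rate=44100, window_size=4096):
    """Reduce a recording to one dominant frequency per analysis
    window: window the samples, take the FFT, and keep the peak bin."""
    hann = np.hanning(window_size)
    freqs = []
    for start in range(0, len(signal) - window_size + 1, window_size):
        frame = signal[start:start + window_size] * hann
        spectrum = np.abs(np.fft.rfft(frame))
        freqs.append(np.argmax(spectrum) * sample_rate / window_size)
    return np.array(freqs)  # sequential frequency representation (Hz)
</code></pre>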
<p> </p>
<h3>2.1.2 Primary Analysis</h3>
<p>
In order to make sense of this recorded performance map, our team conceived of an algorithm to recognize changes in frequency and record them. The system only registers a change in frequency if the change exceeds the instated tolerance, recording the first value that exceeds it and thereby producing a list of notes in the order they are played. The user manually selects the first on-trigger point by entering the number of notes that precede the trigger point. The system triggers an effect if and only if the note count specified by the user equals the number of notes recognized in the live performance.
</p>
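<p>
The change-recognition step then collapses the per-window frequency sequence into an ordered note list. A minimal sketch, with an assumed tolerance value (the report does not specify one):
</p>
<pre><code>def note_onsets(freqs, tolerance_hz=15.0):
    """Register a new note only when the frequency moves more than
    the tolerance away from the last registered note, yielding the
    notes in the order they were played."""
    notes = [float(freqs[0])]
    for f in freqs[1:]:
        if abs(f - notes[-1]) > tolerance_hz:
            notes.append(float(f))
    return notes
</code></pre>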
<h3>2.1.3 Live Recording</h3>
<p>
The live performance presents an interesting obstacle for this implementation: not only must the exact number of notes be monitored, but also the accuracy of the notes. Here, the performer plays the same song that was pre-recorded. The notes again enter the computer as a bit stream and are analyzed in a similar manner to the primary recording. The frequencies are sequentially recorded every time a change greater than the previous note's tolerance occurs.
</p>
<h3>2.1.4 Primary and Live Comparison</h3>
<p>
The analysis of the live performance involves the comparison of the two recordings. If the initial live performance frequency matches the initial recorded frequency within the given tolerance, a 1, or "bang", is recorded. This pattern continues, marking a 1 every time a frequency produces a match. The 1s recorded from the live performance are compared to the count of frequencies before the trigger point marked by the user; when these two numbers are equal, the effect triggers.
</p>
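<p>
Putting the pieces together, here is a minimal sketch of the count-and-compare trigger described above (again with an assumed tolerance; fire stands in for whatever turns the effect on):
</p>
<pre><code>def count_and_trigger(primary_notes, live_freqs, trigger_index,
                      tolerance_hz=15.0, fire=lambda: print("bang")):
    """Walk the live frequency stream against the primary note list,
    counting one match ("bang") per recognized note, and fire the
    effect when the count reaches the user-selected trigger point."""
    matches = 0
    for f in live_freqs:
        if matches == len(primary_notes):
            break                            # whole song matched
        if abs(f - primary_notes[matches]) <= tolerance_hz:
            matches += 1                     # one bang per matched note
            if matches == trigger_index:
                fire()                       # on-trigger point reached
</code></pre>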
<h3>2.1.5 Method Expansion</h3>
<p>
This process can be expanded to account for the off-trigger points, as well as for multiple effects throughout the song. The same concept would be used to mark different points in the primary recording, instead of only the initial on-trigger. The main drawback of this method is its lack of tolerance for error. This problem has a simple solution: first modify the Pd patch to have subpatches for all 12 tempered-scale notes instead of just one of them; then, if one note is missed but the next three are correct, the system fills in a 1, or bang, in place of the missed note.
</p>
<h3>2.2 Pure Data Effects</h3>
<p>
Pure Data (Pd) is the software platform our team used to design the simplified counting method described in section 2.1. Pd is a visual programming language and environment for creating interactive computer music and multimedia works. Its most promising attributes are its visual simplicity and its ability to process audio in real time.
</p>
<h3>2.2.1 Digital Effect Design in Pure Data</h3>
<p>
The visual interface provided by Pure Data makes the creation of digital effects relatively simple. Basic effects can be designed solely through the use of object blocks. Object blocks can contain various controls, such as ADC, DAC, read signal, write signal, or even simple numeric inputs. Using these controls, our team created three different digital effects.
</p>
<h3>2.2.2 Delay Effect in Pure Data</h3>
<p>
Our team designed a guitar delay effect that plays the delayed signal back into the recording, creating the sound of a repeating, decaying echo. First, the guitar signal enters an ADC object and continues to a delay-write block, which allocates memory for a delay line and writes the guitar signal into it. The signal is attenuated by a factor of 0.6 to create the decaying sound, and is then read from the delay line using a delay-read block. Finally, the delayed guitar signal passes through a DAC object, which produces the audio output.
</p>
<p>
<img width="334" height="214" src="Final%20Report%20SP2_files/image002.gif" alt="image002">
</p>
<p>
<i>Figure 1: Delay Effect Pd</i>
</p>
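<p>
For readers without Pd at hand, here is a rough offline NumPy equivalent of this patch (illustrative: the delay time is assumed, since the report specifies only the 0.6 attenuation factor).
</p>
<pre><code>import numpy as np

def delay_effect(x, sample_rate=44100, delay_ms=300.0, feedback=0.6):
    """Feedback delay: each pass through the delay line is attenuated
    by 0.6, producing a repeating, decaying echo."""
    d = int(sample_rate * delay_ms / 1000.0)   # delay line in samples
    y = np.asarray(x, dtype=float).copy()
    for n in range(d, len(y)):
        y[n] += feedback * y[n - d]            # read back, attenuated
    return y
</code></pre>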
<h3>2.2.3 Fuzz Effect in Pure Data</h3>
<p>
Our team designed a guitar fuzz effect that applies distortion to the guitar signal, creating a more processed or artificial guitar sound. Here, the guitar signal is passed through an ADC object and continues on to a gain stage, where it is multiplied by a factor of 40. Next, the signal is passed through a clip object, which restricts the signal to lie between two limits, -0.5 and 0.5. The signal is finally converted back to an analog signal through the DAC object.
</p>
<h3>Demonstrated here!</h3>
<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/krRVGoK9NcA" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
<p>
<img width="344" height="184" src="Final%20Report%20SP2_files/image003.gif" alt="image003">
</p>
<p>
<i>Figure 2: Fuzz Effect Pd</i>
</p>
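<p>
The fuzz transfer function is simple enough to state in a couple of lines; a NumPy sketch of the same gain-and-clip chain:
</p>
<pre><code>import numpy as np

def fuzz_effect(x, gain=40.0, limit=0.5):
    """Hard-clipping fuzz: multiply the signal by a gain of 40,
    then clip it to the range [-0.5, 0.5], as in the Pd patch."""
    return np.clip(gain * np.asarray(x, dtype=float), -limit, limit)
</code></pre>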
<h3>2.2.4 Reverb Effect in Pure Data</h3>
<p>
Our team designed a reverb guitar effect, which models a large number of reflections of the signal that build up and then decay. The guitar signal is first sent through an ADC object and then to a freeverb~ block, which uses the Schroeder and Moorer model for reverb. The guitar signal is finally sent to both outputs of the DAC object, which produces the reverberated sound. There are many controls associated with the reverb that allow users to manipulate the sound of the effect. The roomsize block lets the user slide over a range of 0 to 1, simulating a smaller or larger room. The damping block slides from 0 to 1, affecting the damping of the surfaces within the simulated room. The wet/dry slider controls the amount of reverb applied to the dry, or untouched, signal. The freeze toggle grabs the real-time tail of the reverb and sustains it continuously. The bypass toggle turns the effect off and passes through a completely dry guitar signal. These toggles served as a convenient aid throughout the testing process.
</p>
<p>
<img width="366" height="274" src="Final%20Report%20SP2_files/image004.gif" alt="image004">
</p>
<p>
<i>Figure 3: Reverb Effect Pd</i>
</p>
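<p>
The freeverb~ object implements the Schroeder/Moorer topology internally; the toy offline sketch below shows the idea (parallel feedback combs followed by allpass diffusers, mixed by a wet/dry control). The comb and allpass delay lengths are a subset of the standard Freeverb tunings, and the parameter values are illustrative.
</p>
<pre><code>import numpy as np

def comb(x, delay, feedback):
    """Feedback comb filter: one parallel branch of the reverb."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n] + (feedback * y[n - delay] if n >= delay else 0.0)
    return y

def allpass(x, delay, g=0.5):
    """Schroeder allpass section: smears echoes without coloring."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        xd = x[n - delay] if n >= delay else 0.0
        yd = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + xd + g * yd
    return y

def reverb(x, wet=0.3, room=0.84):
    """Toy Schroeder/Moorer reverb: sum four combs, diffuse with two
    allpasses, then mix with the dry signal (wet/dry control)."""
    x = np.asarray(x, dtype=float)
    tail = sum(comb(x, d, room) for d in (1116, 1188, 1277, 1356)) / 4.0
    tail = allpass(allpass(tail, 556), 441)
    return (1.0 - wet) * x + wet * tail
</code></pre>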
<h3>2.2.5 Spectral Delay Effect in Pure Data</h3>
<p>
Our team designed a spectral delay effect to scatter the original guitar sound, allowing one to hear all the partials (harmonics) ringing at different times. This effect creates the sound of a repeated, echoed guitar signal. In this patch, the FFT of the incoming guitar signal is calculated and used to cut the sound into very thin frequency bands. A different, user-controlled delay is then applied to each of these bands before resynthesis. If the lengths of the delay lines vary greatly from one frequency band to another, the ringing of each harmonic becomes more apparent. Here, the guitar signal is passed through an ADC object and continues on to a Pd block containing user-controlled sliders. Each slider allows the user to specify the amount of delay, feedback, and gain applied. The user is also able to control the cutoff of the low-pass filter, as well as how wet or dry the signal should be. Next, the signal is passed through an equal-power crossfade object, which maintains the overall volume through the crossfade. The signal is finally converted back to an analog signal through the DAC object.
</p>
<p>
<img width="406" height="315" src="Final%20Report%20SP2_files/image005.gif" alt="image005">
</p>
<p>
<i>Figure 4: Spectral Delay Effect Pd</i>
</p>
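<p>
An offline approximation of this patch is straightforward with SciPy's STFT: delay each frequency band by a different number of analysis frames, then resynthesize. The sketch below omits the patch's per-band feedback, low-pass filter, and crossfade, and the linear delay ramp is illustrative.
</p>
<pre><code>import numpy as np
from scipy.signal import stft, istft

def spectral_delay(x, sample_rate=44100, nperseg=1024, max_delay_frames=32):
    """Cut the signal into thin frequency bands with an STFT, shift
    each band later in time by a band-dependent number of frames,
    and resynthesize with the inverse STFT."""
    f, t, X = stft(x, fs=sample_rate, nperseg=nperseg)
    Y = np.zeros_like(X)
    n_bins, n_frames = X.shape
    for k in range(n_bins):
        d = int(max_delay_frames * k / n_bins)  # longer delay, higher band
        Y[k, d:] = X[k, :n_frames - d]          # shift band k in time
    _, y = istft(Y, fs=sample_rate, nperseg=nperseg)
    return y
</code></pre>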
<h2>3.0 Chapter 3: Dynamic Time Warping</h2>
<h3>3.1 Dynamic Time Warping (DTW)</h3>
<p>
Dynamic Time Warping is an algorithm for measuring the similarity between two temporal sequences, which may vary in speed. The algorithm calculates an optimal match between two given sequences in the form of a distance that is the sum of localized cost functions. The process can be thought of conceptually as arranging the two sequences on the sides of a grid. Each cell within the grid is filled in with a distance measure comparing the corresponding elements of the two sequences. This measure can be the simple difference of one sequence from the other, but more often each cell is computed with a more symmetric measure, such as the square of the difference. To find the best path through the grid, we search for the path that minimizes the total distance between the sequences. The procedure for finding an overall distance measure is to consider all possible routes through the grid and calculate the total distance for each; because we are looking for a minimum-length path, each route's total distance is the minimum of the sum of distances between individual elements on the path, divided by the sum of the warping function.
</p>
<p>
Most instances of the DTW algorithm share common optimizations. The most obvious is known as the monotonic condition, which, simply stated, is the rule that the path will not turn back on itself; the indexes of the matrix must either remain the same or increase at each subsequent iteration. Further, by the literal definition of a path, the elements selected must border each other, so the indexes may only increase by 1 from each element of the path to the next. Another commonality in DTW algorithms is the boundary condition, which requires that the path begin at the intersection of the two sequences' respective first elements and end at the intersection of their respective last elements. Other optimizations are common but not integral; for example, the adjustment window condition keeps the path from wandering too far from the diagonal of the grid, which, while occasionally a critical flaw, usually yields huge gains in computational efficiency. Another such optimization is a slope constraint, which ensures that particularly long sequences aren't matched with short sequences, but this can result in inaccuracies if one of the signals is particularly dilated or contracted in time. Since this technique has been at the forefront of comparing temporal signals since the 1980s, the list of variations on the classic algorithm is extensive.
</p>
<h3>3.2 Why Dynamic Time Warping Is a Good Choice for Effect Triggering</h3>
<p>
In the more simplistic counting approach employed, the guitarist is required to replicate the performance faithfully both in time and in accuracy; small inaccuracies in either respect could easily cause the triggering scheme to miscount outside our predetermined range. DTW is a suitable alternative because it is relatively insensitive to time-scale contraction or dilation in either the database or query signal. Further, even if the performer makes numerous mistakes in the performance, as long as the section is the closest match to the database sequence, the program will consider it a match. The robustness of this algorithm is proven time and again by speech dictation software, where the tolerance for difference between the two signals is astounding. Moreover, when considering different musical feature measures, the methods of comparison differ depending on the feature, with some requiring convoluted machinery such as neural networks; even more demanding is the need to combine different features in a useful way. Dynamic Time Warping works as an accurate comparison between features so long as they are structured in time (even if the information they convey is in frequency). In addition, because the features considered for a DTW algorithm are set in time, if one chose to track multiple features concurrently (no longer necessary in the context of DTW), it would likely be simple to correlate detected matches between different features by a simple timing threshold. Following this train of thought, one could set the effects platform to trigger on, or shortly after, the occurrence of the secondary detection, allowing the system to virtually guarantee the absence of preemptive triggering. As if these arguments were not persuasive enough, another benefit of DTW is that if the system requires multiple database performances, then section-specific DTW-distance thresholds can be set for each part of a song, ensuring accuracy for multiple trigger events, or for different effects triggered on different instances of specific recurring parts of a song. Most importantly, because of the extensive resources already invested in the DTW algorithm, there is a wealth of optimization options and configurations that allow computationally efficient approaches to be selected based on the needs of a specific application.
</p>
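<p>
For reference, here is a minimal NumPy implementation (not the report's Pd or Java version) of the classic recurrence described in section 3.1, using the squared difference as the local cost and enforcing the monotonic and boundary conditions:
</p>
<pre><code>import numpy as np

def dtw_distance(a, b):
    """D[i][j] is the minimum cumulative cost of aligning a[:i] with
    b[:j]. The monotonic condition limits each step to the three
    neighboring cells, and the boundary condition pins the path to
    the first and last elements of both sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2          # local cost
            D[i, j] = cost + min(D[i - 1, j],          # insertion
                                 D[i, j - 1],          # deletion
                                 D[i - 1, j - 1])      # match
    return D[n, m]
</code></pre>
<p>
In practice one would feed this framewise feature sequences, such as the per-window frequency sequences from section 2.1.1, rather than raw samples.
</p>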
<h2>4.0 Chapter 4: Implementation</h2>
<h3>4.1 Introduction</h3>
<p>
When designing the system for the project, the group learned that there were many different methods of approaching the problem. The hardware, software platforms, and analysis methods were components that had to be chosen carefully to meet the full demands of the system. When designing the structure of the project, the group went through many approaches before settling on the final architecture. The various criteria that needed to be met entailed challenges such as mobility, robustness, speed, and usability.
</p>
<h3>4.2 First Approach</h3>
<p>
The first approach consisted of a physical component that would offset and amplify the guitar signal, an external ADC, and a Raspberry Pi. The physical component was required because the signal of an electric guitar falls within the millivolt range, while the external ADC operated with a minimum of 1 volt. Signals coming from the electric guitar would be fed through the component and then into the ADC. The Raspberry Pi would then read the discretized signal through a Python program. When the system detected a trigger event, it would output a signal through the built-in DAC and trigger a physical guitar pedal. This approach was dismissed due to the unnecessary use of a microcontroller and physical effect pedals. To perform signal analysis, it is important to use a CPU with a fast enough processing speed; although the Raspberry Pi is one of the fastest single-board computers on the market, it still falls short of the speed a laptop can provide.
</p>
<h3>4.3 Python Implementation</h3>
<p>
For the next approach, a laptop and a quarter-inch-to-USB adapter were used in lieu of the Raspberry Pi and external components. Python was chosen for this implementation due to its large number of available libraries: NumPy, SciPy, and Matplotlib are mathematical libraries that allow Python to perform MATLAB-like operations. To test the speed at which Python performed analysis on a live signal, a test module was created to read input from a microphone. Using the aforementioned data science libraries, various audio analysis techniques were performed on the signal. The incoming signals were first stored in arrays of 4096 samples. The data was then converted from bytes to integers so that operations could be performed on