Friday 31 January 2014

3rd Supervisor Meeting

Progress Report
  • Continued familiarisation with Max MSP, using the buffer~ and groove~ objects to create a time-stretching feature for samples; the current optimum playback speed is 0.25 of the original sample's rate.
  • Revisited the human sound source, influenced by a sound effects tutorial which uses time stretching on short coughs.
  • Submitted showcase proposal.
  • Have not yet mastered plot~ for generating time-based graphs of attributes such as F0 (fundamental frequency).
  • Integrated overlapping spectrum analysis in the Gizmo~ (1.0) and Groove~ (2.0) prototypes.
  • Sourced a Max MSP Granular tutorial to incorporate these synthesis features in future builds; http://www.cs.au.dk/~dsound/DigitalAudio.dir/Papers/IntroToGranSynth.pdf
Agreed Action Points
  • Start including the use of Record~ working with Buffer~ in future builds, working towards the prospective real-time driven elements of the prototype.
  • Build and test Granular sound synthesis components.
  • Revisit Plot~ to generate temporal analysis graphs. 
  • Investigate the inclusion of customizable master effects that can be automated.
  • Time stretching that can be edited during playback, possibly controlled by a fader.

*Found a really useful Max/MSP tutorial source aimed at beginners that covers a variety of topics, such as basic audio techniques, sequencing and synthesis. It is also handy to have another resource for broadening object knowledge and application.
http://www.paulschuette.com/wp-content/uploads/2013/01/DEMYSTIFYING-MAXMSP.pdf

Wednesday 29 January 2014

Buffer~ and Groove~

Buffer~ "buffer~ works as a buffer of memory in which samples are stored to be saved, edited, or referenced in conjunction with many different objects, including play~ / groove~ (to play the buffer), record~ (records into the buffer), info~ (to report information about the buffer), peek~ (to write into/read from the buffer like the table object), lookup~ (to use the buffer for waveshaping), cycle~ (to specify a 512-point waveform), and wave~ (to specify a waveform)."
Groove~ "groove~ is a variable-rate, looping, sample-playback object which references the audio information stored in a buffer~ object carrying the same name." 

I have incorporated these into prototype 2.0 allowing for the time stretching of a human sound source. The result is pretty impressive considering the use of coughs for the process input.
The design challenge will be incorporating this into something that works in near real time, given the inherent delay introduced by the time stretch.
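The buffer~/groove~ idea can be sketched outside Max. Below is a minimal Python/NumPy illustration of granular time stretching, assuming illustrative grain and hop sizes (not the patch's actual settings): grains are read from a slowly advancing input position and overlap-added at a fixed output rate, which is also one source of the inherent latency.

```python
import numpy as np

def time_stretch(x, factor, grain=1024, hop=256):
    # Naive granular time stretch: Hann-windowed grains are read from
    # an input position that advances 'factor' times slower than the
    # output position, then overlap-added. factor = 0.25 makes the
    # sound four times longer while keeping its pitch.
    window = np.hanning(grain)
    n_out = int(len(x) / factor)
    y = np.zeros(n_out + grain)
    norm = np.zeros(n_out + grain)
    for out_pos in range(0, n_out, hop):
        in_pos = int(out_pos * factor)          # slowed read pointer
        g = x[in_pos:in_pos + grain]
        if len(g) < grain:                      # zero-pad the final grain
            g = np.pad(g, (0, grain - len(g)))
        y[out_pos:out_pos + grain] += g * window
        norm[out_pos:out_pos + grain] += window
    norm[norm == 0] = 1.0                       # avoid divide-by-zero
    return (y / norm)[:n_out]

# e.g. stretching a cough to four times its length:
# roar = time_stretch(cough, 0.25)
```

Because each output block needs a full grain of input before it can be windowed and summed, some latency is unavoidable, which matches the real-time challenge noted above.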

The waveform degradation experienced with gizmo~:

Cough 01, normal.
Cough 01 through gizmo~ set at minus one octave.



Cough 01 through gizmo~ set at minus two octaves.

Tuesday 28 January 2014

Roaring

A biologically based roaring framework, as provided by Weissengruber et al.:

"We suggest that roaring (the low pitch vocalization seen in prototypical form in lions and red deer) has two distinct physiological and acoustic components:
1. a low fundamental frequency, made possible by long or heavy vocal folds, which lead to the low pitch of the roar;
2. lowered formant frequencies, made possible by an elongated vocal tract, which provide the impressive baritone timbre of roars."

This highlights possible principles that can inform the design of an 'accurate' roar sound: lowering the fundamental frequency and lowering the formant frequencies.

Weissengruber, G. E. et al. 2002. Hyoid apparatus and pharynx in the lion (Panthera leo), jaguar (Panthera onca), tiger (Panthera tigris), cheetah (Acinonyx jubatus) and domestic cat (Felis silvestris f. catus). Journal of Anatomy. 201(3): pp.195-209. 
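The second condition (lowered formants via an elongated vocal tract) can be sanity-checked with the standard quarter-wave, closed-open tube model of the vocal tract; the tract lengths below are illustrative assumptions, not measurements from the paper.

```python
# Quarter-wave (closed-open tube) resonator model of the vocal tract:
# formant n sits at F_n = (2n - 1) * c / (4 * L), so elongating the
# tract (larger L) lowers every formant frequency.
C = 343.0  # speed of sound in air, m/s

def formants(tract_length_m, n=3):
    return [(2 * k - 1) * C / (4 * tract_length_m) for k in range(1, n + 1)]

human = formants(0.17)  # ~17 cm adult human tract: ~504, 1513, 2522 Hz
lion = formants(0.45)   # illustrative elongated tract: every formant drops
```

This suggests the Filter stage of the design should pull the formant band centres down by roughly the ratio of the tract lengths.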

From Cough To Roar:

An interesting and surprisingly short process that uses a brief cough as the input sound and then time stretches it.
The result is very impressive.
This just shows you have to experiment, and not go for the obvious "RRRRRRRRRRAAAAAAAWR!"
I will try to experiment with this in future prototypes.



This use of pitch shifting may address altering the Source stage of the Source-Filter framework within this design. It also addresses the first condition outlined by Weissengruber's roaring framework: producing a sound with a low fundamental frequency.
The next stage will be the filtering steps, looking at formants to satisfy the Filter side of the theory.

In terms of utilising time stretching in MSP:

http://www.devinkerr.com/2008/10/30/free_elastic-independent-pitchspeed-control-in-max/

Another paper prototype detailing the analytical process:




David Farmer AMA


  • Amy Edwards So many dragons on screen seem to have similar voices and sounds and this dragon and Merlin's (John Hurt's voice) dragon are my top two favorites. Where did you get the inspiration for your dragon and where or what were you modeling his sound from?

  • David Farmer The FIRST thing I wanted was for the beast part of Smaug to be an alligator. I love the sound of an alligator growl, but have never heard it featured anywhere for a film beast. I didn't want him to just be a lion or tiger, or some pitched version of those, and the alligator fit the bill. I'd heard alligators before but never had the chance to record them. I want to thank Colin Hart for sharing his knowledge of alligators, which led me to get several days' worth of recordings.




  • Jackie Cooper What was the most complicated sound to perfect?

  • David Farmer IMO creature sounds are the single hardest to make, and to make interesting and also believable. There are only so many animals in the world to record, yet visually creatures can be anything you can imagine. So we're working with a finite source palette to satisfy a limitless visual potential. We can also only twist them so far before they sound unrealistic.



  • James Henry Pendziszewski What did you do to make Benedict Cumberbatch's voice more dragon-like for Smaug?

  • David Farmer It helps to start with a great performance, which Benedict gave us in droves. His voice is arguably the best I've been able to work with. The key was the right amount of pitching, with other low-end enhancers, for the most part. But then there were the reverbs to make it sound like it excited the space, as well as vocoding alligator growls to add an extra layer of girth underneath. There were quite a few processes, some added quite subtly, but in the end the sum of the parts created something I hope sounds natural, yet authentic.

This details the process and rationale David Farmer applied when processing Benedict Cumberbatch's voice in The Hobbit: The Desolation of Smaug.
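Farmer mentions vocoding alligator growls underneath the voice. As a rough illustration of the principle only (not his actual processing chain), a minimal channel vocoder imposes the band-by-band amplitude envelopes of a modulator (the voice) onto a carrier (the growl); the band count and envelope time here are arbitrary choices.

```python
import numpy as np

def channel_vocoder(modulator, carrier, sr, n_bands=16, env_ms=20):
    # Split both signals into log-spaced bands, measure the modulator's
    # amplitude envelope in each band, and apply it to the carrier's
    # band before summing. FFT masking stands in for a filterbank.
    n = min(len(modulator), len(carrier))
    mod, car = modulator[:n], carrier[:n]
    edges = np.geomspace(50.0, sr / 2, n_bands + 1)
    freqs = np.fft.rfftfreq(n, 1.0 / sr)
    M, C = np.fft.rfft(mod), np.fft.rfft(car)
    win = max(1, int(sr * env_ms / 1000))
    kernel = np.ones(win) / win                 # moving-average smoother
    out = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band_mod = np.fft.irfft(M * mask, n)
        band_car = np.fft.irfft(C * mask, n)
        env = np.convolve(np.abs(band_mod), kernel, mode="same")
        out += band_car * env
    return out / (np.max(np.abs(out)) + 1e-12)  # normalise to +/-1
```

The result carries the articulation of the voice on the timbre of the carrier, the "extra layer of girth" effect Farmer describes.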




A short example from the 'Lion King' studio sessions featuring the human voice with a hardware resonator, in the form of a metal bin, used to accentuate and attenuate certain formant frequencies.


Monday 27 January 2014

Max Isolated Analysis Update

Using plot~ to create overlapping analysis views of the human and animal sounds, running simultaneously.


Not the optimal analysis option; I am still working towards a graph similar to WASP's, which displays fundamental frequency against time.
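A WASP-style fundamental-frequency-against-time display can be approximated outside Max with frame-wise autocorrelation. This Python/NumPy sketch (frame and hop sizes are arbitrary assumptions) is a stand-in for what plot~ fed by fzero~ should eventually show.

```python
import numpy as np

def f0_track(x, sr, frame=1024, hop=256, fmin=60.0, fmax=500.0):
    # Frame-wise autocorrelation pitch tracker: within each Hann-
    # windowed frame, the lag of the strongest autocorrelation peak
    # in the [sr/fmax, sr/fmin] range gives the period.
    lo, hi = int(sr / fmax), int(sr / fmin)
    times, f0s = [], []
    for start in range(0, len(x) - frame, hop):
        w = x[start:start + frame] * np.hanning(frame)
        ac = np.correlate(w, w, mode="full")[frame - 1:]
        if ac[0] <= 0 or hi >= frame:
            continue                            # silent frame / bad range
        lag = lo + int(np.argmax(ac[lo:hi]))
        times.append(start / sr)
        f0s.append(sr / lag)
    return np.array(times), np.array(f0s)

# Plotting times against f0s (e.g. with matplotlib) gives the
# F0-against-time view that WASP provides.
```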

Friday 24 January 2014

2nd Supervisor Meeting

Meeting Agenda - Plan of action.

Progress Report;
  • Familiarisation with Max MSP, comparing frequency shifting against pitch shifting. Pitch shifting sounds cleaner and more appropriate.
  • Using a sample in playback for more efficient tailoring to material  
  • Attempting in-house analysis and monitoring
Agreed Action Points

  • Different animal settings
  • Analysis inclusion to create measurable parameters and data
  • Usability features
  • Master Mix, Eq and Limiter

Monday 20 January 2014

Using sfplay~

sfplay~ "Use the sfplay~ object to play AIFF, NeXT/SUN (.au), WAVE, and Raw Data files of 1-32 tracks from disk."

The imitation sound file from the pre-production portfolio, 'HumanRoar01', serves as the loopable sound effect at the start of the chain.
gizmo~ has proven more challenging to implement than freqshift~, as it relies on Fast Fourier Transform (FFT) processing. Using the unlock feature in the help window offers a quick way to implement the feature.

Saturday 18 January 2014

1st Supervisor Session

The 1st Supervisor Session went well, addressing the viability of the proposed research and its potential applications. The real-time feature, whilst potentially challenging, would be a worthwhile practical pursuit, as it is a really useful tool and is engaging to practitioners.
The choice of Max MSP as the core prototyping and design tool was supported, given its range of uses and applications, along with its support in the form of practical documentation, tutorials, patches, etc.

Useful practical elements for the prototype's development were advised and discussed, such as potentially using a procedural/granular element to introduce variety and dynamics at the output stage.

Useful tutorial documentation concerning Granular Synthesis:
http://cycling74.com/wiki/index.php?title=MSP_Polyphony_Tutorial_3:_Granular_Synthesis

Prototype Plan

The Max prototypes will be broken down into the following parts:
  • Tackling and augmenting the Source signals
  • Constructing the filters to shape the sound 
  • Adding a Granular component
That will form the design approach of turning this human roar:
Analysed by Praat.
Into this tiger roar:
Analysed by Praat.

Friday 17 January 2014

Oscilloscope Tutorial

http://www.cycling74.com/docs/max5/tutorials/msp-tut/mspchapter24.html

Thursday 16 January 2014

Fast Fourier Transform Tutorial

Max MSP tutorial using fft~

Available http://www.cycling74.com/docs/max5/tutorials/msp-tut/mspchapter26.html

Have done a paper prototype detailing a take on the Source-Filter theory;


Interesting Max MSP Components

MSP signal objects are distinguished from other Max objects by a trailing tilde (~) in their names.
Potentially useful objects include:
  1. Fzero~ "The fzero~ object estimates the fundamental frequency of an incoming, monophonic audio signal. It performs multiple layers of wavelet transforms on an incoming vector, comparing the spacing between the peaks in each"
  2. Gizmo~ "The gizmo~ object implements a frequency-domain pitch shifter. It works by analyzing the frequency bins of an FFT'd signal, finding the peaks in the spectrum, and shifting them along the frequency axis to transpose the sound."
  3. Freqshift~ "freqshift~ is a time-domain frequency shifter (also known as a single-sideband ring modulator)."
  4. Plot~ "Use the plot~ object to graph sets of data as points across a domain. The source of the data to be visualized may be a Max list, an MSP buffer~, or an audio signal. The number of plots may be changed with the @numplots attribute, and each of these "subplots" is addressed through a dedicated inlet. A variety of out-of-the-box configurations are provided as Max object prototypes."
  5. Capture~ "Use the capture~ object to collect signal values for signal debugging or investigation. To record signal values, use the record~ or sfrecord~ object."
Quotes taken from: http://cycling74.com/docs/max6/dynamic/c74_docs.html
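The practical difference between freqshift~ and gizmo~ is worth pinning down: a frequency shifter adds a constant Hz offset to every component, breaking harmonic ratios, while a pitch shifter multiplies all frequencies by a ratio and keeps harmonics aligned. A minimal single-sideband frequency shifter can be sketched in Python/NumPy via the analytic signal (a conceptual illustration, not freqshift~'s actual implementation):

```python
import numpy as np

def freq_shift(x, shift_hz, sr):
    # Single-sideband frequency shift: build the analytic signal via an
    # FFT-based Hilbert transform, multiply by a complex exponential,
    # and take the real part. Every component moves by shift_hz.
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)                 # Hilbert-transform weighting
    h[0] = 1
    h[1:(n + 1) // 2] = 2
    if n % 2 == 0:
        h[n // 2] = 1
    analytic = np.fft.ifft(X * h)
    t = np.arange(n) / sr
    return np.real(analytic * np.exp(2j * np.pi * shift_hz * t))
```

Shifting a harmonic sound at 100 Hz, 200 Hz, 300 Hz up by 50 Hz gives 150, 250, 350 Hz, no longer a harmonic series, which is consistent with pitch shifting sounding "cleaner" in the earlier comparison.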

Wednesday 15 January 2014

Workflow and Max MSP, First Max MSP Test

Workflow Diagram


The workflow diagram demonstrates the relationship between the proposed programs, along with the key stages of the prototype's concurrent development. Additional programs may also be utilised, but this image details the core programs currently considered for the prototype's development.


Max Test


A simple test applying the subtractive synthesis principle, using cascading bandpass filters on a tone generator sound source. The test also contains monitor points before and after the filtering stage.
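The subtractive-synthesis test can be mirrored in Python/NumPy: a broadband source run through cascaded bandpass biquads, using the widely used RBJ audio-EQ-cookbook coefficients. The centre frequency and Q below are illustrative, not the patch's actual settings.

```python
import numpy as np

def biquad_bandpass(x, f0, q, sr):
    # RBJ audio-EQ-cookbook bandpass biquad (constant 0 dB peak gain
    # variant); cascade~ in Max chains sections like this one.
    w0 = 2 * np.pi * f0 / sr
    alpha = np.sin(w0) / (2 * q)
    b = np.array([alpha, 0.0, -alpha])
    a = np.array([1 + alpha, -2 * np.cos(w0), 1 - alpha])
    b, a = b / a[0], a / a[0]
    y = np.zeros_like(x)
    for i in range(len(x)):         # direct-form I difference equation
        y[i] = b[0] * x[i]
        if i >= 1:
            y[i] += b[1] * x[i - 1] - a[1] * y[i - 1]
        if i >= 2:
            y[i] += b[2] * x[i - 2] - a[2] * y[i - 2]
    return y

sr = 8000
noise = np.random.default_rng(0).standard_normal(sr)   # broadband source
shaped = biquad_bandpass(noise, 1000.0, 2.0, sr)
shaped = biquad_bandpass(shaped, 1000.0, 2.0, sr)      # cascade steepens band
```

Inspecting the spectrum before and after the cascade mirrors the patch's monitor points: energy away from the passband is progressively subtracted.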

The Max MSP components used include:

Number~
"Use the number~ object to display signal values or generate them."
Cascade~
"Use the cascade~ to filter an input signal using a series of biquad filters."
Spectroscope~
"spectroscope~ serves as a visual spectrogram or sonogram interface for the analysis of signals."
Filtergraph~
"Use the filtergraph~ object to generate filter coefficients for the biquad~ or cascade~ objects with a graphical interface."
*~ (times)
"*~ is a signal multiplier-operator that outputs a signal which is the multiplication between two signals."
Ezdac~
"ezdac~ works as a user interface version of the dac~ object. It appears as a button which can be clicked with the mouse to turn audio on or off."

The learning curve experienced within Max has been quite gentle, with its searchable object database, help patches and reference notes all nicely integrated into the program.

Quotes taken from: http://cycling74.com/docs/max6/dynamic/c74_docs.html

Wednesday 8 January 2014

Sourced Sounds

I have procured some reference sounds in the form of tiger roars from sound-rangers.com; these will be used to generate waveform data to analyse and to inform the synthesis requirements for the human vocalisations.

Project Tools

These software tools have been looked at as part of the pre-production stage of the research project.   

WASP
WASP’s user interface offers waveform, spectrographic and fundamental frequency displays.

Dehumaniser
Dehumaniser is a program optimised to process the human voice in real time. It serves both as an influence and as a comparative sound synthesis tool. The option to either process an existing sound file or use a live sound input for real-time processing is another desirable feature to integrate into the prototype's design. It is also an example of what can be created and developed with Max.

Max MSP
Cycling ‘74’s Max software offers a modular toolkit to design, construct and run various creative media applications, along with the option of operating them in real time. The program’s blank-canvas layout, together with its open environment, presents an intuitive workspace for creating applications using practically based patching and routing in windows called ‘Patchers’.