Time-Scale Modification
Time-Scale Modification procedures
The audiotsm module provides several time-scale modification procedures:
- ola() (Overlap-Add);
- wsola() (Waveform Similarity-based Overlap-Add);
- phasevocoder() (Phase Vocoder).
The OLA procedure should only be used on percussive audio signals. The WSOLA and Phase Vocoder procedures improve on OLA, and both should give good results in most cases.
Note
If you are unsure which procedure and parameters to choose, using
phasevocoder() with the default parameters should give good
results in most cases. You can listen to the output of the different
procedures on various audio files and at various speeds on the examples
page.
Each function in this module returns a TSM
object that implements a time-scale modification procedure.
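As a minimal sketch of how such an object might be used, the function below builds a phase vocoder and runs it over a WAV file. WavReader and WavWriter come from the audiotsm.io.wav module, which is not documented in this section, and the function name and file paths are placeholders chosen for illustration.

```python
def change_speed(input_path, output_path, speed=0.5):
    """Write a time-scaled copy of a WAV file using the phase vocoder."""
    # Imported lazily so this sketch can be loaded without audiotsm installed.
    from audiotsm import phasevocoder
    from audiotsm.io.wav import WavReader, WavWriter

    with WavReader(input_path) as reader:
        # The output keeps the channel count and sample rate of the input.
        with WavWriter(output_path, reader.channels, reader.samplerate) as writer:
            tsm = phasevocoder(reader.channels, speed=speed)
            # run() reads everything from reader and writes the result to writer.
            tsm.run(reader, writer)
```

Swapping phasevocoder for ola or wsola should only require changing the import and the constructor call, since all three take channels and speed.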
audiotsm.ola(channels, speed=1.0, frame_length=256, analysis_hop=None, synthesis_hop=None)
Returns a TSM object implementing the OLA (Overlap-Add) time-scale modification procedure.
In most cases, you should not need to set frame_length, analysis_hop, or synthesis_hop. If you want to fine-tune these parameters, you can check the documentation of the AnalysisSynthesisTSM class to see what they represent.
Parameters:
- channels (int) – the number of channels of the input signal.
- speed (float, optional) – the speed ratio by which the speed of the signal will be multiplied (for example, if speed is set to 0.5, the output signal will be half as fast as the input signal).
- frame_length (int, optional) – the length of the frames.
- analysis_hop (int, optional) – the number of samples between two consecutive analysis frames (speed * synthesis_hop by default). If analysis_hop is set, the speed parameter will be ignored.
- synthesis_hop (int, optional) – the number of samples between two consecutive synthesis frames (frame_length // 2 by default).
Returns: an audiotsm.base.tsm.TSM object
audiotsm.wsola(channels, speed=1.0, frame_length=1024, analysis_hop=None, synthesis_hop=None, tolerance=None)
Returns a TSM object implementing the WSOLA (Waveform Similarity-based Overlap-Add) time-scale modification procedure.
In most cases, you should not need to set frame_length, analysis_hop, synthesis_hop, or tolerance. If you want to fine-tune these parameters, you can check the documentation of the AnalysisSynthesisTSM class to see what the first three represent.
WSOLA works in the same way as OLA, except that it allows a slight shift (at most tolerance samples) of the position of the analysis frames.
Parameters:
- channels (int) – the number of channels of the input signal.
- speed (float, optional) – the speed ratio by which the speed of the signal will be multiplied (for example, if speed is set to 0.5, the output signal will be half as fast as the input signal).
- frame_length (int, optional) – the length of the frames.
- analysis_hop (int, optional) – the number of samples between two consecutive analysis frames (speed * synthesis_hop by default). If analysis_hop is set, the speed parameter will be ignored.
- synthesis_hop (int, optional) – the number of samples between two consecutive synthesis frames (frame_length // 2 by default).
- tolerance (int, optional) – the maximum number of samples that the analysis frame can be shifted.
Returns: an audiotsm.base.tsm.TSM object
audiotsm.phasevocoder(channels, speed=1.0, frame_length=2048, analysis_hop=None, synthesis_hop=None)
Returns a TSM object implementing the phase vocoder time-scale modification procedure.
In most cases, you should not need to set frame_length, analysis_hop, or synthesis_hop. If you want to fine-tune these parameters, you can check the documentation of the AnalysisSynthesisTSM class to see what they represent.
Parameters:
- channels (int) – the number of channels of the input signal.
- speed (float, optional) – the speed ratio by which the speed of the signal will be multiplied (for example, if speed is set to 0.5, the output signal will be half as fast as the input signal).
- frame_length (int, optional) – the length of the frames.
- analysis_hop (int, optional) – the number of samples between two consecutive analysis frames (speed * synthesis_hop by default). If analysis_hop is set, the speed parameter will be ignored.
- synthesis_hop (int, optional) – the number of samples between two consecutive synthesis frames (frame_length // 4 by default).
Returns: an audiotsm.base.tsm.TSM object
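The default hop sizes documented for the three procedures can be worked out by hand. The sketch below does that arithmetic; default_hops is a hypothetical helper, not a function of audiotsm, and truncating speed * synthesis_hop to an integer is an assumption, since the text above does not say how the product is rounded.

```python
def default_hops(frame_length, speed=1.0, hop_divisor=2):
    """Return (analysis_hop, synthesis_hop) following the documented defaults."""
    synthesis_hop = frame_length // hop_divisor
    # Assumption: the product is truncated to a whole number of samples.
    analysis_hop = int(speed * synthesis_hop)
    return analysis_hop, synthesis_hop


# Default frame lengths and synthesis-hop divisors from the signatures above.
DEFAULTS = {"ola": (256, 2), "wsola": (1024, 2), "phasevocoder": (2048, 4)}

for name, (frame_length, divisor) in DEFAULTS.items():
    analysis_hop, synthesis_hop = default_hops(
        frame_length, speed=0.5, hop_divisor=divisor
    )
    print(f"{name}: analysis_hop={analysis_hop}, synthesis_hop={synthesis_hop}")
```

At speed 0.5 the analysis frames advance half as far as the synthesis frames, which is why the output ends up roughly twice as long as the input.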
TSM Object
The audiotsm.base.tsm module provides an abstract class for real-time
audio time-scale modification procedures.
class audiotsm.base.tsm.TSM
An abstract class for real-time audio time-scale modification procedures.
If you want to use a TSM object to run a TSM procedure on a signal, you should use the run() method in most cases.
clear()
Clears the state of the TSM object, making it ready to be used on another signal (or another part of a signal).
This method should be called before processing a new file, or seeking to another part of a signal.
flush_to(writer)
Writes as many output samples as possible to writer, assuming that there are no remaining samples that will be added to the input (i.e. that the write_to() method will not be called), and returns the number of samples that were written.
Parameters: writer – an audiotsm.io.base.Writer.
Returns: a tuple (n, finished), with:
- n: the number of samples that were written to writer;
- finished: a boolean that is True when there are no samples remaining to flush.
Return type: (int, bool)
get_max_output_length(input_length)
Returns the maximum number of samples that will be written to the output given the number of samples of the input.
Parameters: input_length (int) – the number of samples of the input.
Returns: the maximum number of samples that will be written to the output.
read_from(reader)
Reads as many samples as possible from reader, processes them, and returns the number of samples that were read.
Parameters: reader – an audiotsm.io.base.Reader.
Returns: the number of samples that were read from reader.
run(reader, writer, flush=True)
Runs the TSM procedure on the content of reader and writes the output to writer.
Parameters:
- reader – an audiotsm.io.base.Reader.
- writer – an audiotsm.io.base.Writer.
- flush (bool, optional) – True if there is no more data to process.
set_speed(speed)
Sets the speed ratio.
Parameters: speed (float) – the speed ratio by which the speed of the signal will be multiplied (for example, if speed is set to 0.5, the output signal will be half as fast as the input signal).
write_to(writer)
Writes as many result samples as possible to writer.
Parameters: writer – an audiotsm.io.base.Writer.
Returns: a tuple (n, finished), with:
- n: the number of samples that were written to writer;
- finished: a boolean that is True when there are no samples remaining to write. In this case, the read_from() method should be called to add new input samples, or, if there are no remaining input samples, the flush_to() method should be called to get the last output samples.
Return type: (int, bool)
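The hand-off between read_from(), write_to(), and flush_to() described above can be sketched as a driver loop. Everything in this example is a toy stand-in: ListReader, ListWriter, PassThroughTSM, and their take, put, and empty members are hypothetical and not part of audiotsm. The loop shows one way the three methods could be combined, not the library's implementation of run().

```python
class ListReader:
    """Hypothetical reader that hands out samples from a list."""

    def __init__(self, samples):
        self._samples = list(samples)

    @property
    def empty(self):
        return not self._samples

    def take(self, n):
        chunk, self._samples = self._samples[:n], self._samples[n:]
        return chunk


class ListWriter:
    """Hypothetical writer that accepts at most `capacity` samples per call."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = []

    def put(self, chunk):
        accepted = chunk[: self.capacity]
        self.samples.extend(accepted)
        return len(accepted)


class PassThroughTSM:
    """Toy TSM stand-in: copies input unchanged, but holds back the last
    `hold` samples until flush_to() is called."""

    def __init__(self, block=4, hold=2):
        self._block, self._hold, self._buffer = block, hold, []

    def read_from(self, reader):
        chunk = reader.take(self._block)
        self._buffer.extend(chunk)
        return len(chunk)

    def write_to(self, writer):
        ready = self._buffer[: max(0, len(self._buffer) - self._hold)]
        n = writer.put(ready)
        del self._buffer[:n]
        # finished: nothing more can be written until new input arrives.
        return n, len(self._buffer) <= self._hold

    def flush_to(self, writer):
        n = writer.put(self._buffer)
        del self._buffer[:n]
        return n, not self._buffer

    def clear(self):
        self._buffer = []


def run(tsm, reader, writer, flush=True):
    """Alternate reading and writing, then flush the remaining samples."""
    finished = False
    while not (finished and reader.empty):
        tsm.read_from(reader)
        _, finished = tsm.write_to(writer)
    if flush:
        finished = False
        while not finished:
            _, finished = tsm.flush_to(writer)
        tsm.clear()


writer = ListWriter(capacity=3)
run(PassThroughTSM(), ListReader(range(10)), writer)
print(writer.samples)
```

The writer's small capacity forces write_to() to report finished as False on some passes, which is the case the (n, finished) return value exists to handle.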