Time-Scale Modification
Time-Scale Modification procedures
The audiotsm module provides several time-scale modification procedures:
- ola() (Overlap-Add);
- wsola() (Waveform Similarity-based Overlap-Add);
- phasevocoder() (Phase Vocoder).
The OLA procedure should only be used on percussive audio signals. The WSOLA and the Phase Vocoder procedures are improvements of the OLA procedure, and should both give good results in most cases.
Note
If you are unsure which procedure and parameters to choose, using phasevocoder() with the default parameters should give good results in most cases. You can listen to the output of the different procedures on various audio files and at various speeds on the examples page.
Each function of this module returns a TSM object that implements a time-scale modification procedure.
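As an illustration, a WAV file could be time-stretched roughly as sketched below. The WavReader and WavWriter classes come from the library's audiotsm.io.wav module, which is not described in this section, so their use here is an assumption. The stretch_wav helper and the file names are hypothetical.

```python
def stretch_wav(input_path, output_path, speed=0.5):
    """Time-stretch a WAV file with the phase vocoder (hypothetical helper)."""
    # Imported lazily so that audiotsm is only required when the helper is called.
    from audiotsm import phasevocoder
    from audiotsm.io.wav import WavReader, WavWriter  # assumed I/O classes

    with WavReader(input_path) as reader:
        with WavWriter(output_path, reader.channels, reader.samplerate) as writer:
            # The factory function returns a TSM object; run() does the rest.
            tsm = phasevocoder(reader.channels, speed=speed)
            tsm.run(reader, writer)
```

Calling stretch_wav("input.wav", "output.wav", speed=0.5) would then write a version of the input that plays at half speed. Swapping phasevocoder for ola or wsola should only require changing the factory function.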
audiotsm.ola(channels, speed=1.0, frame_length=256, analysis_hop=None, synthesis_hop=None)
Returns a TSM object implementing the OLA (Overlap-Add) time-scale modification procedure.
In most cases, you should not need to set the frame_length, the analysis_hop or the synthesis_hop. If you want to fine-tune these parameters, see the documentation of the AnalysisSynthesisTSM class for what they represent.
Parameters:
- channels (int) – the number of channels of the input signal.
- speed (float, optional) – the speed ratio by which the speed of the signal will be multiplied (for example, if speed is set to 0.5, the output signal will be half as fast as the input signal).
- frame_length (int, optional) – the length of the frames.
- analysis_hop (int, optional) – the number of samples between two consecutive analysis frames (speed * synthesis_hop by default). If analysis_hop is set, the speed parameter will be ignored.
- synthesis_hop (int, optional) – the number of samples between two consecutive synthesis frames (frame_length // 2 by default).
Returns: an audiotsm.base.tsm.TSM object
audiotsm.wsola(channels, speed=1.0, frame_length=1024, analysis_hop=None, synthesis_hop=None, tolerance=None)
Returns a TSM object implementing the WSOLA (Waveform Similarity-based Overlap-Add) time-scale modification procedure.
In most cases, you should not need to set the frame_length, the analysis_hop, the synthesis_hop, or the tolerance. If you want to fine-tune these parameters, see the documentation of the AnalysisSynthesisTSM class for what the first three represent.
WSOLA works in the same way as OLA, except that it allows a slight shift (of at most tolerance samples) in the position of the analysis frames.
Parameters:
- channels (int) – the number of channels of the input signal.
- speed (float, optional) – the speed ratio by which the speed of the signal will be multiplied (for example, if speed is set to 0.5, the output signal will be half as fast as the input signal).
- frame_length (int, optional) – the length of the frames.
- analysis_hop (int, optional) – the number of samples between two consecutive analysis frames (speed * synthesis_hop by default). If analysis_hop is set, the speed parameter will be ignored.
- synthesis_hop (int, optional) – the number of samples between two consecutive synthesis frames (frame_length // 2 by default).
- tolerance (int, optional) – the maximum number of samples that the analysis frame can be shifted.
Returns: an audiotsm.base.tsm.TSM object
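To make the role of tolerance more concrete, the sketch below searches for the shift of an analysis frame that best matches a target frame, which is the general idea behind waveform-similarity matching. It is an illustration only and not audiotsm's implementation: the best_shift function, its arguments, and the use of unnormalized cross-correlation as the similarity measure are all assumptions.

```python
def best_shift(signal, start, target, tolerance):
    """Return the shift in [-tolerance, tolerance] whose frame best matches target."""
    frame_length = len(target)
    best, best_score = 0, float("-inf")
    for shift in range(-tolerance, tolerance + 1):
        pos = start + shift
        if pos < 0 or pos + frame_length > len(signal):
            continue  # candidate frame would fall outside the signal
        frame = signal[pos:pos + frame_length]
        # Unnormalized cross-correlation as the similarity measure (assumption).
        score = sum(a * b for a, b in zip(frame, target))
        if score > best_score:
            best, best_score = shift, score
    return best


# Toy data: a single impulse at index 13, and a target frame with its
# impulse at offset 3, so the matching frame starts at index 10.
signal = [0.0] * 20
signal[13] = 1.0
target = [0.0] * 8
target[3] = 1.0

shift = best_shift(signal, start=7, target=target, tolerance=4)
print(shift)  # prints 3, since 7 + 3 = 10
```

A larger tolerance widens the search and can find better-aligned frames, at the cost of more comparisons per frame.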
audiotsm.phasevocoder(channels, speed=1.0, frame_length=2048, analysis_hop=None, synthesis_hop=None)
Returns a TSM object implementing the phase vocoder time-scale modification procedure.
In most cases, you should not need to set the frame_length, the analysis_hop or the synthesis_hop. If you want to fine-tune these parameters, see the documentation of the AnalysisSynthesisTSM class for what they represent.
Parameters:
- channels (int) – the number of channels of the input signal.
- speed (float, optional) – the speed ratio by which the speed of the signal will be multiplied (for example, if speed is set to 0.5, the output signal will be half as fast as the input signal).
- frame_length (int, optional) – the length of the frames.
- analysis_hop (int, optional) – the number of samples between two consecutive analysis frames (speed * synthesis_hop by default). If analysis_hop is set, the speed parameter will be ignored.
- synthesis_hop (int, optional) – the number of samples between two consecutive synthesis frames (frame_length // 4 by default).
Returns: an audiotsm.base.tsm.TSM object
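The default hop sizes of the three procedures follow directly from the signatures and parameter descriptions above. The short calculation below works them out for a given speed. The default_hops helper is hypothetical, and truncating the analysis hop to an integer is an assumption about how the product speed * synthesis_hop is rounded.

```python
# Default frame lengths and synthesis-hop divisors, as documented above.
DEFAULTS = {
    "ola": {"frame_length": 256, "divisor": 2},
    "wsola": {"frame_length": 1024, "divisor": 2},
    "phasevocoder": {"frame_length": 2048, "divisor": 4},
}


def default_hops(procedure, speed=1.0):
    """Return (analysis_hop, synthesis_hop) implied by the documented defaults."""
    cfg = DEFAULTS[procedure]
    synthesis_hop = cfg["frame_length"] // cfg["divisor"]
    # Assumption: the product is truncated to a whole number of samples.
    analysis_hop = int(speed * synthesis_hop)
    return analysis_hop, synthesis_hop


for name in DEFAULTS:
    print(name, default_hops(name, speed=0.5))
```

At speed=0.5 the analysis hop is half the synthesis hop, so each input frame advances half as far as the corresponding output frame, which is what stretches the signal to twice its length.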
TSM Object
The audiotsm.base.tsm module provides an abstract class for real-time audio time-scale modification procedures.
class audiotsm.base.tsm.TSM
An abstract class for real-time audio time-scale modification procedures.
If you want to use a TSM object to run a TSM procedure on a signal, you should use the run() method in most cases.
clear()
Clears the state of the TSM object, making it ready to be used on another signal (or another part of a signal).
This method should be called before processing a new file or seeking to another part of a signal.
flush_to(writer)
Writes as many output samples as possible to writer, assuming that no more samples will be added to the input (i.e. that the read_from() method will not be called again), and returns the number of samples that were written together with a flag indicating whether flushing is finished.
Parameters: writer – an audiotsm.io.base.Writer.
Returns: a tuple (n, finished), with:
- n: the number of samples that were written to writer;
- finished: a boolean that is True when there are no samples remaining to flush.
Return type: (int, bool)
get_max_output_length(input_length)
Returns the maximum number of samples that will be written to the output given the number of samples of the input.
Parameters: input_length (int) – the number of samples of the input.
Returns: the maximum number of samples that will be written to the output.
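Because the speed ratio multiplies the playback speed, the output holds roughly input_length / speed samples. The helper below is a hypothetical back-of-the-envelope estimate, not part of audiotsm. The bound returned by get_max_output_length() may well be somewhat larger, for instance because of framing, so this value should not be used to size buffers.

```python
def approx_output_length(input_length, speed):
    """Rough output length in samples for a given speed ratio (estimate only)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return round(input_length / speed)


# One second of audio at 44100 Hz, slowed to half speed and sped up to double speed.
half_speed = approx_output_length(44100, 0.5)
double_speed = approx_output_length(44100, 2.0)
print(half_speed, double_speed)
```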
read_from(reader)
Reads as many samples as possible from reader, processes them, and returns the number of samples that were read.
Parameters: reader – an audiotsm.io.base.Reader.
Returns: the number of samples that were read from reader.
run(reader, writer, flush=True)
Runs the TSM procedure on the content of reader and writes the output to writer.
Parameters:
- reader – an audiotsm.io.base.Reader.
- writer – an audiotsm.io.base.Writer.
- flush (bool, optional) – True if there is no more data to process.
set_speed(speed)
Sets the speed ratio.
Parameters: speed (float) – the speed ratio by which the speed of the signal will be multiplied (for example, if speed is set to 0.5, the output signal will be half as fast as the input signal).
write_to(writer)
Writes as many result samples as possible to writer.
Parameters: writer – an audiotsm.io.base.Writer.
Returns: a tuple (n, finished), with:
- n: the number of samples that were written to writer;
- finished: a boolean that is True when there are no samples remaining to write. In this case, the read_from() method should be called to add new input samples, or, if there are no remaining input samples, the flush_to() method should be called to get the last output samples.
Return type: (int, bool)
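When run() is not suitable, for instance in a real-time setting, the read_from(), write_to() and flush_to() methods can be combined by hand. The loop below is a sketch of how the documented return values might be used, not the library's own run() implementation. It assumes the writer always accepts every sample offered to it. PassThroughTSM is a hypothetical stand-in that uses plain lists as reader and writer so the example runs without audio files.

```python
def process(tsm, reader, writer):
    """Drive a TSM-like object by hand using read_from/write_to/flush_to."""
    while True:
        n_read = tsm.read_from(reader)
        n_written, finished = tsm.write_to(writer)
        # Stop once nothing more can be read and nothing is left to write.
        if finished and n_read == 0 and n_written == 0:
            break
    # No more input will arrive: drain whatever is still buffered.
    finished = False
    while not finished:
        _, finished = tsm.flush_to(writer)
    tsm.clear()  # leave the object ready for another signal


class PassThroughTSM:
    """Hypothetical stand-in (not part of audiotsm): copies input to output,
    holding back the last sample until flush to mimic internal buffering."""

    def __init__(self):
        self._pending = []

    def clear(self):
        self._pending = []

    def read_from(self, reader):
        chunk = reader[:4]  # consume at most four samples per call
        del reader[:4]
        self._pending.extend(chunk)
        return len(chunk)

    def write_to(self, writer):
        ready = self._pending[:-1]
        writer.extend(ready)
        self._pending = self._pending[-1:]
        return len(ready), True

    def flush_to(self, writer):
        n = len(self._pending)
        writer.extend(self._pending)
        self._pending = []
        return n, True


source = list(range(10))
sink = []
process(PassThroughTSM(), source, sink)
print(sink)
```

With a real TSM object, reader and writer would be audiotsm.io.base.Reader and audiotsm.io.base.Writer instances, and the output would be time-scaled rather than copied.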