Internal API
Analysis-Synthesis based TSM procedures
The audiotsm.base.analysis_synthesis module provides a base class for real-time analysis-synthesis based audio time-scale modification procedures.
-
class audiotsm.base.analysis_synthesis.AnalysisSynthesisTSM(converter, channels, frame_length, analysis_hop, synthesis_hop, analysis_window, synthesis_window, delta_before=0, delta_after=0)
An audiotsm.base.tsm.TSM for real-time analysis-synthesis based time-scale modification procedures.

The basic principle of an analysis-synthesis based TSM procedure is to first decompose the input signal into short overlapping frames, called the analysis frames. The frames have a fixed length frame_length, and are separated by analysis_hop samples, as illustrated below:

          <--------frame_length--------><-analysis_hop->
Frame 1:  [~~~~~~~~~~~~~~~~~~~~~~~~~~~~]
Frame 2:                  [~~~~~~~~~~~~~~~~~~~~~~~~~~~~]
Frame 3:                                  [~~~~~~~~~~~~~~~~~~~~~~~~~~~~]
...
It then relocates the frames on the time axis by changing the distance between them (to synthesis_hop), as illustrated below:

          <--------frame_length--------><----synthesis_hop---->
Frame 1:  [~~~~~~~~~~~~~~~~~~~~~~~~~~~~]
Frame 2:                         [~~~~~~~~~~~~~~~~~~~~~~~~~~~~]
Frame 3:                                                [~~~~~~~~~~~~~~~~~~~~~~~~~~~~]
...
This changes the speed of the signal by the ratio analysis_hop / synthesis_hop (for example, if the synthesis_hop is twice the analysis_hop, the output signal will be half as fast as the input signal).

However, this simple method introduces artifacts into the signal. These artifacts can be reduced by modifying the analysis frames in various ways. This is done by a converter object, which converts the analysis frames into modified frames called the synthesis frames.

To further reduce the artifacts, window functions (the analysis_window and the synthesis_window) can be applied to the analysis frames and the synthesis frames in order to smooth the signal.

Some TSM procedures (e.g. WSOLA-like methods) may need access to some samples preceding or following an analysis frame to generate the synthesis frame. The delta_before and delta_after parameters specify the number of samples needed before and after the analysis frame, so that they are available to the converter.

For more details on Time-Scale Modification procedures, I recommend reading “A Review of Time-Scale Modification of Music Signals” by Jonathan Driedger and Meinard Müller.
Parameters:
- converter (Converter) – an object that implements the conversion of the analysis frames into synthesis frames.
- channels (int) – the number of channels of the input signal.
- frame_length (int) – the length of the frames.
- analysis_hop (int) – the number of samples between two consecutive analysis frames.
- synthesis_hop (int) – the number of samples between two consecutive synthesis frames.
- analysis_window (numpy.ndarray) – a window applied to the analysis frames.
- synthesis_window (numpy.ndarray) – a window applied to the synthesis frames.
- delta_before (int) – the number of samples preceding an analysis frame that the converter requires (this is usually 0, except for WSOLA-like methods).
- delta_after (int) – the number of samples following an analysis frame that the converter requires (this is usually 0, except for WSOLA-like methods).
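The frame relocation described above can be sketched with plain NumPy. This is a minimal illustration and not the library's implementation: it handles a mono signal, uses no converter (frames are copied unchanged), and the function name naive_ola_stretch and the choice of a Hann synthesis window are assumptions made for the example.

```python
import numpy as np


def naive_ola_stretch(signal, frame_length, analysis_hop, synthesis_hop):
    """Relocate overlapping frames of a mono signal with no converter.

    A toy illustration only: frames are copied unchanged, windowed,
    overlap-added, and normalized by the summed window.
    """
    n = np.arange(frame_length)
    # Periodic Hann window, used here as the synthesis window (assumption).
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_length)

    n_frames = 1 + (len(signal) - frame_length) // analysis_hop
    out_length = (n_frames - 1) * synthesis_hop + frame_length
    output = np.zeros(out_length)
    window_sum = np.zeros(out_length)

    for i in range(n_frames):
        a = i * analysis_hop   # where the frame is read from
        s = i * synthesis_hop  # where the frame is written to
        output[s:s + frame_length] += window * signal[a:a + frame_length]
        window_sum[s:s + frame_length] += window

    # Undo the amplitude modulation caused by the overlapping windows.
    np.divide(output, window_sum, out=output, where=window_sum > 1e-8)
    return output


sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 220.0 * t)

# synthesis_hop is twice analysis_hop, so the output is roughly twice as
# long as the input, i.e. half as fast.
slow = naive_ola_stretch(tone, frame_length=256, analysis_hop=64,
                         synthesis_hop=128)
```

Because the frames are not modified, this sketch exhibits the artifacts mentioned above; reducing them is the job of the converter.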
-
class audiotsm.base.analysis_synthesis.Converter
A base class for objects implementing the conversion of analysis frames into synthesis frames.
-
clear()
Clears the state of the Converter, making it ready to be used on another signal (or another part of a signal). It is called by the clear() method and the constructor of AnalysisSynthesisTSM.
-
convert_frame(analysis_frame)
Converts an analysis frame into a synthesis frame.
Parameters: analysis_frame (numpy.ndarray) – a matrix of shape (m, delta_before + frame_length + delta_after), with m the number of channels, containing the analysis frame and some samples before and after (as specified by the delta_before and delta_after parameters of the AnalysisSynthesisTSM calling the Converter). analysis_frame[:, delta_before:-delta_after] contains the actual analysis frame (without the samples preceding and following it).
Returns: a synthesis frame represented as a numpy.ndarray of shape (m, frame_length), with m the number of channels.
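As a rough sketch of the interface, the class below mirrors clear() and convert_frame() with a converter that returns the analysis frame unchanged. IdentityConverter is a hypothetical name; it deliberately does not subclass the real Converter so that the snippet runs without audiotsm installed, and its constructor arguments are assumptions made for the example. Note that the slice delta_before:-delta_after is empty when delta_after is 0, so the sketch uses an explicit end index instead.

```python
import numpy as np


class IdentityConverter:
    """Hypothetical stand-in mirroring the Converter interface."""

    def __init__(self, frame_length, delta_before=0, delta_after=0):
        self.frame_length = frame_length
        self.delta_before = delta_before
        self.delta_after = delta_after
        self.frames_seen = 0

    def clear(self):
        # Reset any state kept between frames.
        self.frames_seen = 0

    def convert_frame(self, analysis_frame):
        self.frames_seen += 1
        start = self.delta_before
        # An explicit end index also works when delta_after == 0, whereas
        # the slice [delta_before:-0] would be empty.
        return analysis_frame[:, start:start + self.frame_length].copy()


converter = IdentityConverter(frame_length=4, delta_before=2, delta_after=1)
# 2 channels, 2 + 4 + 1 samples per channel.
frame = np.arange(14, dtype=float).reshape(2, 7)
synthesis_frame = converter.convert_frame(frame)
```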
-
Circular buffers
The audiotsm.utils module provides utility functions and classes used in the implementation of time-scale modification procedures.
-
class audiotsm.utils.CBuffer(channels, max_length)
A CBuffer is a circular buffer used to store multichannel audio data.

It can be seen as a variable-size buffer whose length is bounded by max_length. The CBuffer.write() and CBuffer.right_pad() methods add samples at the end of the buffer, while the CBuffer.read() and CBuffer.remove() methods remove samples from the beginning of the buffer.

Unlike the samples added by CBuffer.write() and CBuffer.read_from(), those added by the CBuffer.right_pad() method are considered not ready to be read. In effect, they can be modified by the CBuffer.add() and CBuffer.divide() methods, but have to be marked as ready to be read with the CBuffer.set_ready() method before being read with the CBuffer.peek(), CBuffer.read(), or CBuffer.write_to() methods.
Parameters: -
add(buffer)
Adds a buffer element-wise to the CBuffer.
Parameters: buffer (numpy.ndarray) – a matrix of shape (m, n), with m the number of channels and n the length of the buffer.
Raises: ValueError – if the CBuffer and the buffer do not have the same number of channels or the CBuffer is smaller than the buffer (self.length < n).
-
divide(array)
Divides each channel of the CBuffer element-wise by the array.
Parameters: array (numpy.ndarray) – an array of shape (n,).
Raises: ValueError – if the length of the CBuffer is smaller than the length of the array (self.length < n).
-
peek(buffer)
Reads as many samples from the CBuffer as possible, without removing them from the CBuffer, writes them to the buffer, and returns the number of samples that were read.

The samples need to be marked as ready to be read with the CBuffer.set_ready() method in order to be read. This is done automatically by the CBuffer.write() and CBuffer.read_from() methods.
Parameters: buffer (numpy.ndarray) – a matrix of shape (m, n), with m the number of channels and n the length of the buffer, where the samples will be written.
Returns: the number of samples that were read from the CBuffer.
Raises: ValueError – if the CBuffer and the buffer do not have the same number of channels.
-
read(buffer)
Reads as many samples from the CBuffer as possible, removes them from the CBuffer, writes them to the buffer, and returns the number of samples that were read.

The samples need to be marked as ready to be read with the CBuffer.set_ready() method in order to be read. This is done automatically by the CBuffer.write() and CBuffer.read_from() methods.
Parameters: buffer (numpy.ndarray) – a matrix of shape (m, n), with m the number of channels and n the length of the buffer, where the samples will be written.
Returns: the number of samples that were read from the CBuffer.
Raises: ValueError – if the CBuffer and the buffer do not have the same number of channels.
-
read_from(reader)
Reads as many samples as possible from reader, writes them to the CBuffer, and returns the number of samples that were read.

The written samples are marked as ready to be read.
Parameters: reader – an audiotsm.io.base.Reader.
Returns: the number of samples that were read from reader.
Raises: ValueError – if the CBuffer and reader do not have the same number of channels.
-
ready
The number of samples that can be read.
-
remove(n)
Removes the first n samples of the CBuffer, preventing them from being read again, and leaving more space for new samples to be written.
Parameters: n (int) – the number of samples to remove.
Returns: the number of samples that were removed.
-
right_pad(n)
Adds zeros at the end of the CBuffer.

The added samples are not marked as ready to be read. CBuffer.set_ready() must be called before they can be read.
Parameters: n (int) – the number of zeros to add.
Raises: ValueError – if there is not enough space to add the zeros.
-
set_ready(n)
Marks the next n samples as ready to be read.
Parameters: n (int) – the number of samples to mark as ready to be read.
Raises: ValueError – if there are fewer than n samples that are not ready yet.
-
to_array()
Returns an array containing the same data as the CBuffer.
Returns: a numpy.ndarray of shape (m, n), with m the number of channels and n the length of the buffer.
-
write(buffer)
Writes as many samples from the buffer to the CBuffer as possible, and returns the number of samples that were written.

The written samples are marked as ready to be read.
Parameters: buffer (numpy.ndarray) – a matrix of shape (m, n), with m the number of channels and n the length of the buffer, from which the samples will be read.
Returns: the number of samples that were written to the CBuffer.
Raises: ValueError – if the CBuffer and the buffer do not have the same number of channels.
-
write_to(writer)
Writes as many samples as possible to writer, deletes them from the CBuffer, and returns the number of samples that were written.

The samples need to be marked as ready to be read with the CBuffer.set_ready() method in order to be written. This is done automatically by the CBuffer.write() and CBuffer.read_from() methods.
Parameters: writer – an audiotsm.io.base.Writer.
Returns: the number of samples that were written to writer.
Raises: ValueError – if the CBuffer and writer do not have the same number of channels.
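The distinction between stored samples and samples that are ready to be read can be illustrated with a toy circular buffer. ToyCBuffer is a hypothetical class written for this example and is not the library's CBuffer: it implements only a few of the methods, and its refusal to write() while unready samples are pending is a simplification of its own.

```python
import numpy as np


class ToyCBuffer:
    """Toy circular buffer with a 'ready' prefix (illustration only)."""

    def __init__(self, channels, max_length):
        self._data = np.zeros((channels, max_length))
        self._max_length = max_length
        self._offset = 0   # physical index of the oldest sample
        self.length = 0    # samples currently stored
        self.ready = 0     # samples that may be read

    def _indices(self, start, n):
        # Physical positions of n samples starting at logical index start.
        return (self._offset + start + np.arange(n)) % self._max_length

    def write(self, buffer):
        if self.length != self.ready:
            # Restriction of this toy class, keeps ready samples in front.
            raise ValueError("unready samples must be marked ready first")
        n = min(buffer.shape[1], self._max_length - self.length)
        self._data[:, self._indices(self.length, n)] = buffer[:, :n]
        self.length += n
        self.ready += n
        return n

    def right_pad(self, n):
        if n > self._max_length - self.length:
            raise ValueError("not enough space")
        self._data[:, self._indices(self.length, n)] = 0.0
        self.length += n  # the padding is stored but not ready

    def add(self, buffer):
        n = buffer.shape[1]
        if n > self.length:
            raise ValueError("buffer is longer than the stored data")
        self._data[:, self._indices(0, n)] += buffer

    def set_ready(self, n):
        if n > self.length - self.ready:
            raise ValueError("not enough unready samples")
        self.ready += n

    def read(self, buffer):
        n = min(buffer.shape[1], self.ready)
        buffer[:, :n] = self._data[:, self._indices(0, n)]
        self._offset = (self._offset + n) % self._max_length
        self.length -= n
        self.ready -= n
        return n


buf = ToyCBuffer(channels=1, max_length=8)
buf.write(np.ones((1, 3)))          # 3 samples, ready immediately
buf.right_pad(2)                    # 2 zeros, stored but not ready
buf.add(np.full((1, 5), 0.5))       # modifies all 5 stored samples
out = np.zeros((1, 8))
first = buf.read(out)               # only the 3 ready samples come out
buf.set_ready(2)
second = buf.read(out[:, first:])   # now the padded samples can be read
```

The pattern of right_pad(), add(), then set_ready() is what allows overlapping frames to be accumulated in place before the finished samples are released to the reader.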
-
-
class audiotsm.utils.NormalizeBuffer(length)
A NormalizeBuffer is a mono-channel circular buffer, used to normalize audio buffers.
Parameters: length (int) – the length of the NormalizeBuffer.
-
add(window)
Adds a window element-wise to the NormalizeBuffer.
Parameters: window (numpy.ndarray) – an array of shape (n,).
Raises: ValueError – if the window is larger than the buffer (n > self.length).
-
length
The length of the NormalizeBuffer.
-
remove(n)
Removes the first n values of the NormalizeBuffer.
Parameters: n (int) – the number of values to remove.
-
to_array(start=0, end=None)
Returns an array containing the same data as the NormalizeBuffer, from index start (included) to index end (excluded).
Returns: numpy.ndarray
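One plausible use of such a buffer is to track the sum of the windows that overlap at each output position, so that the output can later be divided by it. The sketch below imitates the add(), to_array() and remove() steps with a plain (non-circular) NumPy array; how the library actually combines these calls is an assumption here, and the variable names are illustrative.

```python
import numpy as np

frame_length, hop = 8, 4
n = np.arange(frame_length)
# Periodic Hann window, chosen for the example.
window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_length)

# Plain array standing in for the normalization buffer.
norm = np.zeros(frame_length)
emitted = []
for _ in range(4):
    norm[:frame_length] += window        # like add(window)
    emitted.append(norm[:hop].copy())    # like to_array(0, hop)
    # like remove(hop): drop the first hop values, make room at the end
    norm = np.concatenate([norm[hop:], np.zeros(hop)])
```

After the first frame, each emitted block holds the full window sum for hop output samples; with this window and a hop of half the frame length, that sum is constant.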
-
Window functions
The audiotsm.utils.windows module contains window functions used for digital signal processing.
-
audiotsm.utils.windows.apply(buffer, window)
Applies a window to a buffer.
Parameters:
- buffer (numpy.ndarray) – a matrix of shape (m, n), with m the number of channels and n the length of the buffer.
- window – a numpy.ndarray of shape (n,).
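Applying a window of shape (n,) to a buffer of shape (m, n) amounts to a broadcast multiplication in NumPy. The helper below is a sketch under that reading; apply_window is a hypothetical name, and it returns a new array, whereas this page does not say whether the library's apply() works in place.

```python
import numpy as np


def apply_window(buffer, window):
    """Multiply every channel of buffer (m, n) by window (n,)."""
    if buffer.shape[1] != window.shape[0]:
        raise ValueError("buffer and window lengths differ")
    # Broadcasting stretches the (n,) window across the m channels.
    return buffer * window


stereo = np.ones((2, 4))
ramp = np.array([0.0, 0.5, 1.0, 0.5])
windowed = apply_window(stereo, ramp)
```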
-
audiotsm.utils.windows.hanning(length)
Returns a periodic Hanning window.

Contrary to numpy.hanning(), which returns the symmetric Hanning window, hanning() returns a periodic Hanning window, which is better for spectral analysis.
Parameters: length (int) – the number of points of the Hanning window.
Returns: the window as a numpy.ndarray of shape (length,).
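The difference between the two variants can be checked numerically. The sketch below uses the standard definition of the periodic Hann window; periodic_hanning is a hypothetical helper written for this example, and whether the library computes its window exactly this way is an assumption.

```python
import numpy as np


def periodic_hanning(length):
    """Periodic Hann window: 0.5 - 0.5 * cos(2 * pi * n / length)."""
    n = np.arange(length)
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * n / length)


length = 8
periodic = periodic_hanning(length)
symmetric = np.hanning(length)

# The periodic window equals a symmetric window of length + 1 points with
# its last point dropped.
from_symmetric = np.hanning(length + 1)[:-1]

# Copies of the periodic window shifted by half its length add up to a
# constant, which is convenient for overlap-add processing.
overlap_sum = periodic[: length // 2] + periodic[length // 2:]
```

The symmetric window ends on a zero, while the periodic one leaves that zero to the start of the next period.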
-
audiotsm.utils.windows.product(window1, window2)
Returns the product of two windows.
Parameters:
- window1 – a numpy.ndarray of shape (n,) or None.
- window2 – a numpy.ndarray of shape (n,) or None.
Returns: the product of the two windows. If one of the windows is equal to None, the other is returned, and if the two are equal to None, None is returned.
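The None handling described above can be written in a few lines. This is a sketch of the documented behaviour with a hypothetical function name, not the library's source.

```python
import numpy as np


def window_product(window1, window2):
    """Element-wise product of two windows, treating None as 'no window'."""
    if window1 is None:
        return window2  # also covers the case where both are None
    if window2 is None:
        return window1
    return window1 * window2


half = np.array([0.5, 0.5, 0.5])
ramp = np.array([0.0, 1.0, 2.0])
both = window_product(half, ramp)
only_one = window_product(None, ramp)
neither = window_product(None, None)
```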