How Pitch Shifting Works

A long-form, technical but readable explanation of resampling, phase vocoders, and why retuning changes duration.

If you have ever sped up a vinyl record, you already understand the core of pitch shifting. The sound becomes higher and shorter at the same time. Slow it down and it becomes lower and longer. That is the simplest and most transparent form of pitch shifting: resampling, or changing playback rate.

But modern audio tools often promise a different result: change pitch without changing duration, or change duration without changing pitch. That is where time stretching and algorithms like the phase vocoder come in.

This article walks through the main methods and why your retuned files sound the way they do.

Method 1: Resampling (playback rate)

Resampling changes the speed of audio playback. When the playback rate increases, each waveform cycle happens faster, so pitch rises. Because the cycles happen faster, the audio finishes sooner, so duration decreases.

This is the most direct and predictable method. It does not synthesize new spectral content; it simply plays the existing signal at a different rate. In audio engineering terms, the samples are resampled (or played back at a different sample rate) with no compensating time-stretch step.
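As a rough illustration, a playback-rate change can be sketched as reading the input at fractional positions. The function name resample_linear and the use of linear interpolation are assumptions made for this example; they are not taken from any particular tool, and production resamplers normally use band-limited filters instead.

```python
def resample_linear(samples, rate):
    """Read `samples` at `rate` times the original speed.

    rate > 1 raises pitch and shortens the output; rate < 1 lowers
    pitch and lengthens it. Linear interpolation is used only because
    it is the simplest option for a sketch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if len(samples) < 2:
        return list(samples)

    out_len = int((len(samples) - 1) / rate) + 1
    out = []
    for n in range(out_len):
        pos = n * rate                  # fractional read position in the input
        i = int(pos)                    # sample just before that position
        frac = pos - i                  # distance past sample i, in [0, 1)
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

Calling resample_linear(signal, 2.0) returns roughly half as many samples, which is the "higher and shorter" effect described above; a rate of 0.5 returns roughly twice as many.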

The result:

Every frequency is multiplied by the playback rate ratio
Duration is divided by the same ratio
No time stretching artifacts
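The arithmetic behind these three points is short enough to write out. The sketch below is illustrative only: the function name retune_by_rate and the example values (a 440 Hz reference retuned to 432 Hz, a 180 second track) are assumptions chosen for the demonstration.

```python
import math

def retune_by_rate(source_hz, target_hz, duration_s):
    """Return (rate, cents, new_duration) for a pure playback-rate retune."""
    rate = target_hz / source_hz          # playback-rate ratio
    cents = 1200.0 * math.log2(rate)      # pitch change in cents (100 cents = 1 semitone)
    new_duration = duration_s / rate      # duration scales by the inverse ratio
    return rate, cents, new_duration

# Hypothetical example values, not taken from the article.
rate, cents, new_duration = retune_by_rate(440.0, 432.0, 180.0)
print(f"rate={rate:.4f}  cents={cents:.1f}  duration={new_duration:.2f} s")
```

With these example values the track ends up about 31.8 cents lower and about 3.3 seconds longer.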

This is the method used by our tool. It is simple, transparent, and reliable for retuning to a new reference frequency.

Method 2: Time stretching (phase vocoder and related methods)

Time stretching aims to change duration without changing pitch. A common approach uses the short-time Fourier transform (STFT). The signal is split into overlapping windows, transformed into the frequency domain, and then resynthesized with modified time spacing.
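The "overlapping windows with modified time spacing" idea can be sketched without the frequency-domain step. The example below is a plain time-domain overlap-add, not an STFT method and not a phase vocoder: it skips the transform and any phase correction, so it is expected to sound rough on tonal material. The function name ola_stretch and the frame and hop sizes are assumptions chosen for illustration.

```python
import math

def ola_stretch(samples, stretch, frame=256, hop_out=64):
    """Naive overlap-add time stretch (stretch > 1 makes the output longer)."""
    if stretch <= 0:
        raise ValueError("stretch must be positive")
    if len(samples) < frame:
        raise ValueError("input must be at least one frame long")

    hop_in = hop_out / stretch            # spacing between frames read from the input
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame) for n in range(frame)]

    n_frames = int((len(samples) - frame) / hop_in) + 1
    out_len = (n_frames - 1) * hop_out + frame
    out = [0.0] * out_len
    gain = [0.0] * out_len                # summed window weight per output sample

    for k in range(n_frames):
        start_in = int(k * hop_in)        # where the frame is read
        start_out = k * hop_out           # where the frame is written (new spacing)
        for n in range(frame):
            w = window[n]
            out[start_out + n] += samples[start_in + n] * w
            gain[start_out + n] += w

    # Divide by the accumulated window weight so overlaps do not change level.
    return [o / g if g > 1e-9 else 0.0 for o, g in zip(out, gain)]
```

Frames are read every hop_in samples but written every hop_out samples, so the output is longer or shorter while each frame keeps its original pitch. The discontinuities this creates between frames are the problem that phase-aware methods try to reduce.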

A phase vocoder tracks the phase relationships between successive frames to reduce discontinuities during reconstruction. It is powerful but can introduce artifacts such as smearing, transient blurring, or a watery texture. These artifacts are why extreme time stretching often sounds unnatural.
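The phase-tracking step can be sketched for a single frequency bin. This follows the textbook phase-vocoder update (estimate the true frequency from the phase change between two analysis frames, then advance the output phase by that frequency over the synthesis hop). The function and parameter names are assumptions for this example, and a full implementation would apply this to every bin of every frame around a forward and inverse FFT.

```python
import math

def wrap_phase(phi):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi

def propagate_phase(prev_phase, curr_phase, prev_synth_phase,
                    bin_index, frame_size, hop_in, hop_out):
    """Return (new_synth_phase, true_freq) for one bin.

    prev_phase, curr_phase: measured phases in two successive analysis frames.
    prev_synth_phase: phase already written to the previous output frame.
    hop_in, hop_out: analysis and synthesis hop sizes in samples.
    """
    omega = 2.0 * math.pi * bin_index / frame_size   # bin centre, radians per sample
    expected = omega * hop_in                        # advance if exactly on the bin centre
    deviation = wrap_phase(curr_phase - prev_phase - expected)
    true_freq = omega + deviation / hop_in           # estimated frequency, radians per sample
    new_synth_phase = wrap_phase(prev_synth_phase + true_freq * hop_out)
    return new_synth_phase, true_freq
```

Because each bin is advanced independently, the phase relationships between neighbouring bins can drift apart, which is one source of the smeared or watery texture mentioned above.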

Many commercial tools combine time stretching with resampling so that pitch changes while duration stays the same, but the tradeoff is that more processing tends to mean more artifacts.
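One common way to change pitch while keeping duration is to time-stretch by the pitch ratio and then resample by the same ratio, so the two duration changes cancel. The bookkeeping can be sketched as follows; the function name plan_pitch_shift and the example values are assumptions for illustration, and no audio is processed here.

```python
def plan_pitch_shift(semitones, duration_s):
    """Return the durations at each stage of a stretch-then-resample pitch shift."""
    pitch_ratio = 2.0 ** (semitones / 12.0)            # frequency multiplier
    stretched_s = duration_s * pitch_ratio             # step 1: time stretch, pitch unchanged
    final_s = stretched_s / pitch_ratio                # step 2: resample, pitch shifts
    return pitch_ratio, stretched_s, final_s

# Hypothetical example: shift a 10 second clip up by 2 semitones.
pitch_ratio, stretched_s, final_s = plan_pitch_shift(2.0, 10.0)
print(f"ratio={pitch_ratio:.4f}  after stretch={stretched_s:.2f} s  final={final_s:.2f} s")
```

The resampling step is transparent, so any artifacts in the result come from the stretch step.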

Method 3: Hybrid and modern algorithms

Modern pitch shifting tools often use hybrid methods that preserve transients or separate harmonic and percussive components. Some are excellent. Some are not. The key point is that every algorithm makes choices about what to preserve and what to alter.
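To make "preserve transients" concrete, here is a minimal sketch of one choice such a tool might make: flag frames where the energy jumps sharply, so they can be handled differently from steady material. This is a generic energy-ratio detector written for illustration, not the method of any specific product; the function name, frame size, and threshold are assumptions.

```python
def find_transient_frames(samples, frame=256, ratio=4.0):
    """Return indices of frames whose energy exceeds `ratio` times the previous frame's."""
    energies = []
    for start in range(0, len(samples) - frame + 1, frame):
        block = samples[start:start + frame]
        energies.append(sum(x * x for x in block))

    floor = 1e-12   # avoids treating silence-to-silence as a jump
    return [k for k in range(1, len(energies))
            if energies[k] > ratio * max(energies[k - 1], floor)]
```

A hybrid algorithm could copy the flagged frames through unstretched and apply time stretching only to the rest, trading some timing accuracy for sharper attacks.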

For simple retuning, a direct playback rate change is often the clearest choice. It keeps the signal consistent, avoids algorithmic artifacts, and produces an output that is faithful to the source recording.

Why duration changes are expected

When you retune by changing playback rate, duration changes. This is not a bug. It is the expected physical relationship between time and pitch in a sampled signal. If you want to preserve duration, you need additional processing, and that processing will inevitably alter the signal in other ways.

So if you notice that your retuned file is a little shorter or longer, that is normal and correct for a transparent retune.

Practical takeaway

If your goal is a clean, consistent retune to a specific frequency standard, resampling is the most honest method. If your goal is to preserve duration at all costs, then time stretching tools may be useful, but be aware of the artifacts they can introduce.

Retune your music now: /
