Hello,
I can successfully match music using ShazamKit on Apple platforms with SwiftUI: a simple app that lets the user load an audio file and extracts the corresponding match. However, I am unable to match music using ShazamKit on Android. I am trying to build the same simple app, but I get MATCH_ATTEMPT_FAILED every time I try to match. I don't know what I am doing wrong; the ShazamKit part of the Kotlin Android code is in this method:
suspend fun processAudioFileInBackground(
    filePath: String,
    developerTokenProvider: DeveloperTokenProvider
) = withContext(Dispatchers.IO) {
    val bufferSize = 1024 * 1024
    val audioFile = FileInputStream(filePath)
    val byteBuffer = ByteBuffer.allocate(bufferSize)
    byteBuffer.order(ByteOrder.LITTLE_ENDIAN)
    var bytesRead: Int
    while (audioFile.read(byteBuffer.array()).also { bytesRead = it } != -1) {
        val signatureGenerator = (ShazamKit.createSignatureGenerator(AudioSampleRateInHz.SAMPLE_RATE_44100) as ShazamKitResult.Success).data
        signatureGenerator.append(byteBuffer.array(), bytesRead, System.currentTimeMillis())
        val signature = signatureGenerator.generateSignature()
        println("Signature: ${signature.durationInMs}")
        val catalog = ShazamKit.createShazamCatalog(developerTokenProvider, Locale.ENGLISH)
        val session = (ShazamKit.createSession(catalog) as ShazamKitResult.Success).data
        val matchResult = session.match(signature)
        println("MatchResult : $matchResult")
        setMatchResult(matchResult)
        byteBuffer.clear()
    }
    audioFile.close()
}
I noticed that changing the Locale used to create the catalog produces a different result: I get NoMatch without an exception. Can you please help me with this? Do I need to create a custom catalog?
According to the documentation (https://developer.apple.com/documentation/avfoundation/avplayeritem/externalmetadata), AVPlayerItem should have an externalMetadata property. However, it does not appear to be visible to my app. When I try to use it, I get:
Value of type 'AVPlayerItem' has no member 'externalMetadata'
Documentation states iOS 12.2+; I am building with a minimum deployment target of iOS 18.
Code snippet:
import Foundation
import AVFoundation
/// ... in function ...
// create metadata as described in https://developer.apple.com/videos/play/wwdc2022/110338
var title = AVMutableMetadataItem()
title.identifier = .commonIdentifierAlbumName
title.value = "My Title" as NSString?
title.extendedLanguageTag = "und"
var playerItem = await AVPlayerItem(asset: composition)
playerItem.externalMetadata = [ title ]
Hi everyone,
I’m trying to use AVAssetResourceLoaderDelegate to handle a live radio stream (e.g. Icecast/HTTP stream). My goal is to have access to the last 30 seconds of audio data during playback, so I can analyze it for specific audio patterns in near-real-time.
I’ve implemented a custom resource loader that works fine for podcasts and static files, where the file size and content length are known. However, for infinite live streams, my current implementation stops receiving new loading requests after the first one is served. As a result, the playback either stalls or fails to continue.
Has anyone successfully used AVAssetResourceLoaderDelegate with a continuous radio stream? Or can you suggest a better approach for buffering and analyzing live audio?
Any tips, examples, or advice would be appreciated. Thanks!
I developed an educational app that implements audio-video communication through RTC, while using WebView to display course materials during classes. However, some users are experiencing an issue where the audio playback from WebView is very quiet. I've checked that the AVAudioSessionCategory is set by RTC to AVAudioSessionCategoryPlayAndRecord, and the AVAudioSessionCategoryOption also includes AVAudioSessionCategoryOptionMixWithOthers. What could be causing the WebView audio to be suppressed, and how can this be resolved?
Hi, in my project I am using AVFoundation to record audio. We use the AVAudioMixerNode method below to record the audio packets.
func installTap(
    onBus bus: AVAudioNodeBus,
    bufferSize: AVAudioFrameCount,
    format: AVAudioFormat?,
    block tapBlock: @escaping AVAudioNodeTapBlock
)
It works perfectly fine.
But in the production environment, a small percentage of users are hitting an issue where recording stops automatically after a few packets, without the audio engine being stopped. Can anyone explain why this happens? I have also observed mediaServicesWereResetNotification and added a log for when this notification is received, but when the issue happens I don't see any occurrence of this log. Also, is there any callback when the engine stops?
In my app I use AVAssetReaderTrackOutput to extract PCM audio from a user-provided video or audio file and display it as a waveform.
Recently a user reported that the waveform is not in sync with his video, and after receiving the video I noticed that the waveform is in fact twice as long as the video duration, i.e. it shows the audio in slow motion, so to speak.
Until now I was using
CMFormatDescription.audioStreamBasicDescription.mSampleRate
which for this particular user video returns 22'050. But in this case it seems that this value is wrong... because the audio file has two audio channels with different sample rates, as returned by
CMFormatDescription.audioFormatList.map({ $0.mASBD.mSampleRate })
The first channel has a sample rate of 44'100, the second one 22'050. If I use the first sample rate, the waveform is perfectly in sync with the video.
The problem shows up in the ratio between the audio data length and the product of the sample rate and the audio duration: it is 8, double the ratio for the first audio file (4). In the code below this ratio is given by
Double(length) / (sampleRate * asset.duration.seconds)
When commenting out the line with the sampleRate variable definition in the code below and uncommenting the following line, the ratios for both audio files are 4, which is the expected result. I would expect audioStreamBasicDescription to return the correct sample rate, i.e. the one used by AVAssetReaderTrackOutput, which (I think) somehow merges the stereo tracks. The documentation is sparse, and in particular it’s not documented whether the lower or higher sample rate is used; in this case, it seems like the higher one is used, but audioStreamBasicDescription for some reason returns the lower one.
Does anybody know why this is the case or how I should extract the sample rate of the produced PCM audio data? Should I always take the higher one?
I created FB19620455.
let openPanel = NSOpenPanel()
openPanel.allowedContentTypes = [.audiovisualContent]
openPanel.runModal()
let url = openPanel.urls[0]
let asset = AVURLAsset(url: url)
let assetTrack = asset.tracks(withMediaType: .audio)[0]
let assetReader = try! AVAssetReader(asset: asset)
let readerOutput = AVAssetReaderTrackOutput(track: assetTrack, outputSettings: [AVFormatIDKey: Int(kAudioFormatLinearPCM), AVLinearPCMBitDepthKey: 16, AVLinearPCMIsBigEndianKey: false, AVLinearPCMIsFloatKey: false, AVLinearPCMIsNonInterleaved: false])
readerOutput.alwaysCopiesSampleData = false
assetReader.add(readerOutput)
let formatDescriptions = assetTrack.formatDescriptions as! [CMFormatDescription]
let sampleRate = formatDescriptions[0].audioStreamBasicDescription!.mSampleRate
//let sampleRate = formatDescriptions[0].audioFormatList.map({ $0.mASBD.mSampleRate }).max()!
print(formatDescriptions[0].audioStreamBasicDescription!.mSampleRate)
print(formatDescriptions[0].audioFormatList.map({ $0.mASBD.mSampleRate }))
if !assetReader.startReading() {
    preconditionFailure()
}
var length = 0
while assetReader.status == .reading {
    guard let sampleBuffer = readerOutput.copyNextSampleBuffer(), let blockBuffer = sampleBuffer.dataBuffer else {
        break
    }
    length += blockBuffer.dataLength
}
print(Double(length) / (sampleRate * asset.duration.seconds))
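The ratio printed at the end of the snippet above is the number of bytes per PCM frame. A minimal sketch of that arithmetic in plain Swift, assuming a hypothetical 10-second stereo file decoded to interleaved 16-bit PCM at 44,100 Hz (the helper names `bytesPerFrame` and `observedRatio` are illustrative and not part of AVFoundation):

```swift
import Foundation

/// Bytes per interleaved PCM frame: channel count times bytes per sample.
/// Hypothetical helper for illustration only.
func bytesPerFrame(channels: Int, bitDepth: Int) -> Int {
    return channels * (bitDepth / 8)
}

/// The ratio from the snippet above: total PCM bytes / (sample rate * duration).
func observedRatio(byteLength: Int, sampleRate: Double, durationSeconds: Double) -> Double {
    return Double(byteLength) / (sampleRate * durationSeconds)
}

// Assumed input: 10 seconds of stereo audio decoded to 16-bit PCM at 44,100 Hz.
let duration = 10.0
let length = Int(44_100 * duration) * bytesPerFrame(channels: 2, bitDepth: 16)

// Dividing by the rate the data was actually decoded at gives bytes per frame.
print(observedRatio(byteLength: length, sampleRate: 44_100, durationSeconds: duration)) // 4.0
// Dividing by half that rate doubles the ratio.
print(observedRatio(byteLength: length, sampleRate: 22_050, durationSeconds: duration)) // 8.0
```

With 16-bit stereo output a ratio of 4 is expected, so a ratio of 8 is consistent with the divisor being half the sample rate of the decoded data.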
Since iOS 18, the system setting “Allow Audio Playback” (enabled by default) allows third-party app audio to continue playing while the user is recording video with the Camera app. This has created a problem for the app I’m developing.
➡️ The problem:
My app plays continuous audio in both foreground and background states. If the user starts recording video using the iOS Camera app, the app’s audio, still playing in the background, gets captured in the video, which is obviously unintended behavior.
Yes, the user could stop the app manually before starting the video recording, but that can’t be guaranteed. As a developer, I need a way to stop the app’s audio before the video recording begins.
So far, I haven’t found a reliable way to detect when video recording starts if ‘Allow Audio Playback’ is ON.
➡️ What I’ve tried:
— AVAudioSession.interruptionNotification → doesn’t fire
— devicesChangedEventStream → not triggered
I don’t want to request mic permission (the app doesn’t use the mic). Also, preventing the app from playing audio in the background isn’t an option, as it is a crucial part of the user experience.
➡️ What I need:
A reliable, supported way to detect when the Camera app begins video recording, without requiring mic access — so I can stop audio and avoid unintentional overlap with the user’s recordings.
Any official guidance, workarounds, or AVFoundation techniques would be greatly appreciated.
Thanks.
I'm working on a v2 Audio Unit that has some complicated internal state (audio, midi, other settings).
When the internal state changes, I want to inform the host (for instance, Logic Pro) that my plugin state has changed, and that the main window should show the 'project changed' status through the window close button.
This was easy to achieve for the VST version of the plugin, but I can't figure out a way to do it for the Audio Unit.
I've tried:
Notifying change of the kAudioUnitProperty_ClassInfo property that stores the plugin state:
unit->PropertyChanged(kAudioUnitProperty_ClassInfo, kAudioUnitScope_Global, 0);
Setting the kAudioUnitProperty_ClassInfo property value each time the plugin state changes.
Adding a new parameter called 'dirtystate' and toggling it and notifying the change each time the plugin state changes.
But nothing really makes Logic take notice. This should be an easy task, but I can't put my finger on it.
How do I flag my AUv2 as needing its state saved (i.e. the host project needs saving)?
Hi!
I am writing a browser extension that lets you control media playback on a music service website. Unfortunately, Safari does not support tracking changes to the audible property in the tabs.onUpdated event. Is there an alternative to this event? I'm looking for a way to detect when the automatic inference engine interrupts playback on a music service website.
Thank you.
Does anyone know how to play the sound of a specific instrument when a button is tapped on the screen of an iPhone or iPad? I'm in the middle of creating a music learning app, and I'm thinking of assigning single notes or chords to button-like frames on an on-screen keyboard and fingerboard. Can this be achieved with SwiftUI code alone? I remember that MIDI Level 1 once offered an instrument sound playback function, but I can't work out how to implement the same function on the current OS. Please lend me your wisdom.
I found that the aggregated device correctly obtains input channels in the standard microphone mode. However, in voice isolation mode, it only retrieves channels from the first sub-device in the aggregated device's list. If I want to properly obtain channel information in voice isolation mode, how should I do it?
My code that streams buffers into AVAudioPlayerNode is stuttering when the buffer is finished and before the next one is played.
while engine.isRunning {
    let framesToCopy = min(buffer.frameLength - framePosition, Self.BufferSize)
    let srcRaw = UnsafeRawPointer(srcPtr)
    let playbackBuffer = AVAudioPCMBuffer(pcmFormat: buffer.format, frameCapacity: Self.BufferSize)!
    let playbackPtr = playbackBuffer.floatChannelData![0]
    let destRaw = UnsafeMutableRawPointer(mutating: playbackPtr)
    memcpy(destRaw, srcRaw, Int(framesToCopy) * MemoryLayout<Float>.stride)
    srcPtr = srcPtr.advanced(by: Int(framesToCopy))
    playbackBuffer.frameLength = framesToCopy
    await player.scheduleBuffer(playbackBuffer,
                                at: nil,
                                options: [],
                                completionCallbackType: .dataRendered)
}
I've tried to schedule multiple buffers at once using a combination of both the synchronous and async versions of scheduleBuffer, because I thought the delay might be the cause, but it still stutters, even though the data copied into the playbackBuffer matches the source buffer. I've tried all combinations of options and completionCallbackType, but no luck.
I've tried increasing the buffer size but that just spaces out the stutters because the buffer is larger.
What am I missing about this API?
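The chunking arithmetic in the loop above can be checked separately from AVFoundation. A minimal sketch in plain Swift, assuming the read position advances by the number of frames copied on every pass (the excerpt advances srcPtr but does not show framePosition being updated); `chunkRanges` is a hypothetical helper name:

```swift
/// Splits `totalFrames` into consecutive ranges of at most `chunkSize` frames.
/// Hypothetical helper for illustration; not an AVFoundation API.
func chunkRanges(totalFrames: Int, chunkSize: Int) -> [Range<Int>] {
    precondition(chunkSize > 0, "chunkSize must be positive")
    var ranges: [Range<Int>] = []
    var framePosition = 0
    while framePosition < totalFrames {
        // The last chunk may be shorter than chunkSize.
        let framesToCopy = min(totalFrames - framePosition, chunkSize)
        ranges.append(framePosition..<(framePosition + framesToCopy))
        // Advance the position so the loop terminates at the final frame.
        framePosition += framesToCopy
    }
    return ranges
}

// Assumed input: 10,000 source frames split into chunks of 4,096 frames.
let ranges = chunkRanges(totalFrames: 10_000, chunkSize: 4_096)
for range in ranges {
    print(range.lowerBound, range.upperBound, range.count)
}
```

Each range would correspond to one playback buffer, and the ranges cover the source exactly once with no gaps or overlaps.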
Environment:
・Device: iPad (10th generation)
・OS: iOS 18.3.2
I'm using AVAudioSession to record sound in my application. I recently realized that when the app starts a recording session on a tablet, the OS automatically sets the tablet volume to 50%, and after recording ends, it does not restore the volume level that was set before the recording started. I would like to know whether this is default OS behavior or a bug.
If it's default behavior, I would much appreciate a link to the documentation.
Hey everyone,
I'm encountering an issue with audio sample rate conversion that I'm hoping someone can help with. Here's the breakdown:
Issue Description:
I've installed a tap on an input device to convert audio to an optimal sample rate.
There's a converter node added on top of this setup.
The problem arises when joining Zoom or FaceTime calls—the converter gets deallocated from memory, causing the program to crash.
Symptoms:
The converter node is being deallocated during video calls.
The program crashes entirely when this happens.
Traditional methods of monitoring sample rate changes (tracking nominal or actual sample rates) aren't working as expected.
The Big Challenge:
I can't figure out how to properly monitor sample rate changes.
Listeners set up to track these changes don't trigger when the device joins a Zoom or FaceTime call.
Please, if anyone has experience with this or knows a solution, I'd really appreciate your help. Thanks in advance!
Hi, I believe I've found a potential error in the sample code on the documentation page for creating and using a process tap with an aggregate device. The issue is in the section explaining how to add a tap to the aggregate device. I have already filed a Feedback Assistant ticket on this (ID: FB17411663) but haven't heard back for months.
Capturing system audio with Core Audio taps
The sample code for modifying the kAudioAggregateDevicePropertyTapList incorrectly uses the tapID as the target AudioObjectID when calling AudioObjectSetPropertyData.
// (Code to get the list and potentially modify listAsArray)
if var listAsArray = list as? [CFString] {
    // ... (modification logic) ...
    // Set the list back on the aggregate device. <--- The comment is correct
    list = listAsArray as CFArray
    _ = withUnsafeMutablePointer(to: &list) { list in
        // INCORRECT: This call uses tapID as the target object.
        AudioObjectSetPropertyData(tapID, &propertyAddress, 0, nil, propertySize, list)
    }
}
The kAudioAggregateDevicePropertyTapList is a property that belongs to the aggregate device, not the individual tap. Therefore, to set this property, the AudioObjectSetPropertyData function must target the AudioObjectID of the aggregate device itself. Using tapID as the first argument is logically incorrect for this operation and will not update the aggregate device as intended.
Furthermore, the preceding AudioObjectGetPropertyData call to fetch the list also appears to use the incorrect tapID as its target in the sample.
The AudioObjectID for both getting and setting this property should be the ID of the aggregate device.
_ = AudioObjectGetPropertyData(aggregateDeviceID, &propertyAddress, 0, nil, &propertySize, &list)
_ = AudioObjectSetPropertyData(aggregateDeviceID, &propertyAddress, 0, nil, propertySize, newList)
Thank you!
I need to apply a headphone-specific scenario only when headphones are the sole active playback device in my iOS audio app.
The problem is that there is no definitive way to determine that headphones are the sole active playback device.
AVAudioSession.currentRoute.outputs portTypes don't guarantee headphones:
let session = AVAudioSession.sharedInstance()
let outputs = session.currentRoute.outputs
let headphonesOnly = outputs.count == 1 &&
(outputs.first?.portType == .headphones ||
outputs.first?.portType == .bluetoothA2DP ||
outputs.first?.portType == .bluetoothHFP ||
outputs.first?.portType == .bluetoothLE)
The issue with the code above is that the listed Bluetooth profiles (A2DP, HFP, LE) can be used by any audio device, not only headphones.
Is there any public API on iOS that can:
Distinguish Bluetooth headphones vs Bluetooth speakers when both use A2DP/LE?
Expose the user’s “Device Type” classification (headphones / speaker / car stereo, etc.) that is shown in Settings → Bluetooth → Device Type?
Provide a more reliable way to know “this route is definitely headphones” for A2DP devices, beyond portType and portName string heuristics?
hi,
Is there an Audio Unit logo I can show on my website? I would love to show that my application is able to host Audio Unit plugins.
regards, Joël
I'm trying to use the new Speech framework for streaming transcription on macOS 26.3, and I can reproduce a failure with SpeechAnalyzer.start(inputSequence:).
What is working:
SpeechAnalyzer + SpeechTranscriber
offline path using start(inputAudioFile:finishAfterFile:)
same Spanish WAV file transcribes successfully and returns a coherent final result
What is not working:
SpeechAnalyzer + SpeechTranscriber
stream path using start(inputSequence:)
same WAV, replayed as AnalyzerInput(buffer:bufferStartTime:)
fails once replay starts with:
_GenericObjCError domain=Foundation._GenericObjCError code=0 detail=nilError
I also tried:
DictationTranscriber instead of SpeechTranscriber
no realtime pacing during replay
Both still fail in stream mode with the same error.
So this does not currently look like a ScreenCaptureKit issue or a Python integration issue. I reduced it to a pure Swift CLI repro.
Environment:
macOS 26.3 (25D122)
Xcode 26.3
Swift 6.2.4
Apple Silicon Mac
Has anyone here gotten SpeechAnalyzer.start(inputSequence:) working reliably on macOS 26.x?
If so, I'd be interested in any workaround or any detail that differs from the obvious setup:
prepareToAnalyze(in:)
bestAvailableAudioFormat(...)
AnalyzerInput(buffer:bufferStartTime:)
replaying a known-good WAV in chunks
I already filed Feedback Assistant:
FB22149971
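When a file is replayed in chunks, each chunk's start time is normally its first frame index divided by the sample rate. A minimal sketch of that bookkeeping in plain Swift, assuming the timestamps are meant to be contiguous and expressed in the file's sample rate; `FrameTime` and `chunkStartTimes` are hypothetical stand-ins, and a real replay would build the equivalent time value for AnalyzerInput(buffer:bufferStartTime:):

```swift
import Foundation

/// A rational timestamp (value / timescale).
/// Illustrative stand-in type; not part of the Speech or CoreMedia frameworks.
struct FrameTime: Equatable {
    let value: Int64
    let timescale: Int32
    var seconds: Double { Double(value) / Double(timescale) }
}

/// Start time of each chunk when `totalFrames` frames are replayed
/// `framesPerChunk` frames at a time.
func chunkStartTimes(totalFrames: Int, framesPerChunk: Int, sampleRate: Int32) -> [FrameTime] {
    precondition(framesPerChunk > 0 && sampleRate > 0)
    // Each chunk starts at the frame index where the previous one ended.
    return stride(from: 0, to: totalFrames, by: framesPerChunk).map {
        FrameTime(value: Int64($0), timescale: sampleRate)
    }
}

// Assumed input: a 1-second file at 16 kHz, replayed in 4,096-frame chunks.
let starts = chunkStartTimes(totalFrames: 16_000, framesPerChunk: 4_096, sampleRate: 16_000)
for start in starts {
    print(start.value, start.seconds)
}
```

Comparing these values against the start times actually passed during replay is one way to rule out gaps or overlaps between chunks as a factor in the failure.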
Is there any way for me to use an AutoMix API in my iOS apps? I would play tracks using the Apple Music API and use AutoMix to attempt to merge tracks.
Is this feature/API available to developers?