In iOS 18, CarPlay shows an error: “There was a problem loading this content” after playback starts. Audio works fine, but the Now Playing screen doesn’t load. I’m using MPPlayableContentManager. This worked fine in iOS 17. Anyone else seeing this error in iOS 18?
Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
We are encountering an issue where AVPlayer throws the error:
Error Domain=AVFoundationErrorDomain Code=-11819 "Cannot Complete Action" > Underlying Error Domain[(null)]:Code[0]:Desc[(null)]
This error seems to occur intermittently during video playback, especially after extended usage or when switching between different streams. We observe Error 11819 (AVFoundationErrorDomain) in Conviva platform that some of our users experience it but we couldn't reproduce it so far and we’re need support to determine the root cause and/or best practices to prevent it.
Some questions we have:
What typically triggers this error?
Could it be related to memory/resource constraints, network instability, or backgrounding?
Are there any recommended ways to handle or recover from this error gracefully?
Any insights or guidance would be greatly appreciated. Thanks!
Topic:
Media Technologies
SubTopic:
Streaming
I created a virtual audio device to capture system audio with a sample rate of 44.1 kHz. After capturing the audio, I forward it to the hardware sound card using AVAudioEngine, also with a sample rate of 44.1 kHz. However, due to the clock sources being unsynchronized, problems occur after a period of playback. How can I retrieve the clock source of the hardware device and set it for the virtual device?
Dear Sirs,
I'd like to add an icon to my audio driver based on AudioDriverKit. This icon should show up left of my audio device in the audio devices dialog. For an Audio Server Plugin I managed to do this using the property kAudioDevicePropertyIcon and CFBundleCopyResourceURL(...) but how would you do this with AudioDriverKit? Should I use IOUserAudioCustomProperty or IOUserAudioControl and how would I refer to the Bundle? Is there an example available somewhere?
Thanks and best regards,
Johannes
I'm reaching out regarding a recurring issue I'm experiencing with MusicKit developer tokens.
I'm using a valid .p8 private key to sign JWTs for Apple MusicKit integration. Each token I generate includes the appropriate claims (iss, iat, exp) and is signed with the ES256 algorithm, with an expiration date set approximately 6 months ahead.
Everything works as expected immediately after generating the token. However, after a few days, the same JWT (still well within its expiration period) suddenly begins returning invalid/unauthorized responses when used in Postman and other API clients.
Importantly:
I did not delete or revoke the .p8 key during this time.
I verified the JWT contains valid claims and a proper structure.
The issue consistently resolves only when I create a new .p8 file and regenerate a fresh JWT with it—after which the cycle repeats.
This issue occurs even when the environment and app identifiers remain unchanged.
I would greatly appreciate it if you could help me understand:
Why these tokens become invalid after a few days, despite having a long exp value and an unchanged key.
Whether there's any automatic revocation or timeout policy on .p8 keys that could explain this behavior.
If there's a better way to maintain long-lived developer tokens without requiring new .p8 key generation every few days.
Thank you for your help and clarification on this issue.
Best regards,
Liad Altif
Hello,
How do I find the apple music catalogID for songs in a apple music playlist?
Im building an iOS app that uses MusicKit/Apple Music API.
For this example, you can assume that my iOS app simply allows users to upload their apple music playlists. And when users open a specific playlist, I want to find the catalogID for each song.
Currently all the songs are returning song IDs in this format “i.PkdJvPXI2AJgm8”.
I believe these are libraryIDs, not catalogIDs.
I’d like to find a front end solution using MusicKit. But perhaps a back end solution using the Apple Music Rest API is required.
Any recommendations would be appreciated!
Since the last update to IOS 26.0 (23A5276f) the AirPods connect to my IPhone and the Audio is still running through the phone. They are shown in the Bluetooth Icon that they’re paired.
Topic:
Media Technologies
SubTopic:
Audio
Hi, I'm developing an application for macos and ios that has to run DetectHumanBodyPose3DRequest model in real time for retrieving the 3d skeleton from the camera.
I'm experiencing a memory leak every time the model is used (when i comment that line, the memory stays constant). After a minute it uses about 1GB of ram running with mac catalyst.
I attached a minimal project that has this problem
Code
Camera View
import SwiftUI
import Combine
import Vision
struct CameraView: View {
@StateObject private var viewModel = CameraViewModel()
var body: some View {
HStack {
ZStack {
GeometryReader { geometry in
if let image = viewModel.currentFrame {
Image(decorative: image, scale: 1)
.resizable()
.scaledToFill()
.frame(width: geometry.size.width,
height: geometry.size.height)
.clipped()
} else {
ProgressView()
}
}
}
}
}
}
class CameraViewModel: ObservableObject {
@Published var currentFrame: CGImage?
@Published var frameRate: Double = 0
@Published var currentVisionBodyPose: HumanBodyPose3DObservation? // Store current body pose
@Published var currentImageSize: CGSize? // Store current image size
private var cameraManager: CameraManager?
private var humanBodyPose = HumanBodyPose3DDetector()
private var lastClassificationTime = Date()
private var frameCount = 0
private var lastFrameTime = Date()
private let classificationThrottleInterval: TimeInterval = 1.0
private var lastPoseSendTime: Date = .distantPast
init() {
cameraManager = CameraManager()
startPreview()
startClassification()
}
private func startPreview() {
Task {
guard let previewStream = cameraManager?.previewStream else { return }
for await frame in previewStream {
let size = CGSize(width: frame.width, height: frame.height)
Task { @MainActor in
self.currentFrame = frame
self.currentImageSize = size
self.updateFrameRate()
}
}
}
}
private func startClassification() {
Task {
guard let classificationStream = cameraManager?.classificationStream else { return }
for await pixelBuffer in classificationStream {
self.classifyFrame(pixelBuffer: pixelBuffer)
}
}
}
private func classifyFrame(pixelBuffer: CVPixelBuffer) {
humanBodyPose.runHumanBodyPose3DRequestOnImage(pixelBuffer: pixelBuffer) { [weak self] observation in
guard let self = self else { return }
DispatchQueue.main.async {
if let observation = observation {
self.currentVisionBodyPose = observation
print(observation)
} else {
self.currentVisionBodyPose = nil
}
}
}
}
private func updateFrameRate() {
frameCount += 1
let now = Date()
let elapsed = now.timeIntervalSince(lastFrameTime)
if elapsed >= 1.0 {
frameRate = Double(frameCount) / elapsed
frameCount = 0
lastFrameTime = now
}
}
}
HumanBodyPose3DDetector
import Foundation
import Vision
class HumanBodyPose3DDetector: NSObject, ObservableObject {
@Published var humanObservation: HumanBodyPose3DObservation? = nil
private let queue = DispatchQueue(label: "humanbodypose.queue")
private let request = DetectHumanBodyPose3DRequest()
private struct SendablePixelBuffer: @unchecked Sendable {
let buffer: CVPixelBuffer
}
public func runHumanBodyPose3DRequestOnImage(pixelBuffer: CVPixelBuffer, completion: @escaping (HumanBodyPose3DObservation?) -> Void) {
let sendableBuffer = SendablePixelBuffer(buffer: pixelBuffer)
queue.async { [weak self] in
Task { [weak self, sendableBuffer] in
do {
guard let self = self else { return }
let result = try await self.request.perform(on: sendableBuffer.buffer)
//process result
DispatchQueue.main.async {
if result.isEmpty {
completion(nil)
} else {
completion(result[0])
}
}
} catch {
DispatchQueue.main.async {
completion(nil)
}
}
}
}
}
}
I have created an app where you can speak using SFSpeechRecognizer and it will recognize you speech into text, translate it and then return it back using speech synthesis. All locales for SFSpeechRecognizer and switching between them work fine when the app is in the foreground but after I turn off my screen(the app is still running I just turned off the screen) and try to create new recognitionTask it it receives this error inside the recognition task: User denied access to speech recognition. The weird thing about this is it only happens with some languages. The error happens with Croatian or Hungarian locale for speech recognition but doesn't with English or Spanish locale.
ImageIO encoding to HEICS fails in macOS 15.5.
log
writeImageAtIndex:1246: *** CMPhotoCompressionSessionAddImageToSequence: err = kCMPhotoError_UnsupportedOperation [-16994] (codec: 'hvc1')
seems to be related with
https://github.com/SDWebImage/SDWebImage/issues/3732
affected version
iOS 18.4 (sim and device), macOS 15.5
unaffected version
iOS 18.3 (sim and device), macOS 15.3
Hey everyone,
I'm encountering an issue with audio sample rate conversion that I'm hoping someone can help with. Here's the breakdown:
Issue Description:
I've installed a tap on an input device to convert audio to an optimal sample rate.
There's a converter node added on top of this setup.
The problem arises when joining Zoom or FaceTime calls—the converter gets deallocated from memory, causing the program to crash.
Symptoms:
The converter node is being deallocated during video calls.
The program crashes entirely when this happens.
Traditional methods of monitoring sample rate changes (tracking nominal or actual sample rates) aren't working as expected.
The Big Challenge:
I can't figure out how to properly monitor sample rate changes.
Listeners set up to track these changes don't trigger when the device joins a Zoom or FaceTime call.
Please, if anyone has experience with this or knows a solution, I'd really appreciate your help. Thanks in advance!
Use case: When SharePlay -ing a fully immersive 3D scene (e.g. a virtual stage), I would like to shine lights on specific Personas, so they show up brighter when someone in the scene is recording the feed (think a camera person in the scene wearing Vision Pro).
Note: This spotlight effect only needs to render in the camera person's headset and does NOT need to be journaled or shared.
Before I dive into this, my technical question: Can environmental and/or scene lighting affect Persona brightness in a SharePlay? If not, is there a way to programmatically make Personas "brighter" when recording?
My screen recordings always seem to turn out darker than what's rendered in environment, and manually adjusting the contrast tends to blow out the details in a Persona's face (especially in visionOS 26).
I'm working in Swift/SwiftUI, running XCode 16.3 on macOS 15.4 and I've seen this when running in the iOS simulator and in a macOS app run from XCode. I've also seen this behaviour with 3 different audio files.
Nothing in the documentation says that the speechRecognitionMetadata property on an SFSpeechRecognitionResult will be nil until isFinal, but that's the behaviour I'm seeing.
I've stripped my class down to the following:
private var isAuthed = false
// I call this in a .task {} in my SwiftUI View
public func requestSpeechRecognizerPermission() {
SFSpeechRecognizer.requestAuthorization { authStatus in
Task {
self.isAuthed = authStatus == .authorized
}
}
}
public func transcribe(from url: URL) {
guard isAuthed else { return }
let locale = Locale(identifier: "en-US")
let recognizer = SFSpeechRecognizer(locale: locale)
let recognitionRequest = SFSpeechURLRecognitionRequest(url: url)
// the behaviour occurs whether I set this to true or not, I recently set
// it to true to see if it made a difference
recognizer?.supportsOnDeviceRecognition = true
recognitionRequest.shouldReportPartialResults = true
recognitionRequest.addsPunctuation = true
recognizer?.recognitionTask(with: recognitionRequest) { (result, error) in
guard result != nil else { return }
if result!.isFinal {
//speechRecognitionMetadata is not nil
} else {
//speechRecognitionMetadata is nil
}
}
}
}
Further, and this isn't documented either, the SFTranscriptionSegment values don't have correct timestamp and duration values until isFinal. The values aren't all zero, but they don't align with the timing in the audio and they change to accurate values when isFinal is true.
The transcription otherwise "works", in that I get transcription text before isFinal and if I wait for isFinal the segments are correct and speechRecognitionMetadata is filled with values.
The context here is I'm trying to generate a transcription that I can then highlight the spoken sections of as audio plays and I'm thinking I must be just trying to use the Speech framework in a way it does not work. I got my concept working if I pre-process the audio (i.e. run it through until isFinal and save the results I need to json), but being able to do even a rougher version of it 'on the fly' - which requires segments to have the right timestamp/duration before isFinal - is perhaps impossible?
When creating and activating AVPictureInPictureController on iPhone 11 running iOS 18.x, the main screen viewport unexpectedly changes from the native 375×812 logical points to 414×896 points (the size of a different device such as iPhone XR/11 Plus). Additionally, a separate UIWindow with zero CGRect is created. This causes UIKit layout calculations, especially related to the keyboard and safeAreaInsets, to malfunction and results in issues like keyboard clipping and general UI misalignment. The problem appears tied specifically to iOS 18+ behavior and does not manifest on earlier iOS versions or other devices consistently.
Steps to Reproduce:
Run the app on iPhone 11 with iOS 18.x (including the latest minor updates).
Instantiate and start an AVPictureInPictureController during app runtime.
Observe that UIScreen.main.bounds changes from 375×812 (native iPhone 11 size) to 414×896.
Notice a secondary UIWindow appears with frame CGRectZero.
Interact with text inputs that cause the keyboard to appear.
The keyboard is clipped or partially obscured, causing UI usability issues.
Layout elements dependent on UIScreen.main.bounds or safeAreaInsets behave incorrectly.
Expected Behavior:
UIScreen.main.bounds should not dynamically change upon PiP controller creation. The coordinate space and viewport should remain accurate to device native dimensions. The keyboard and UI layout should adjust properly with respect to the actual on-screen size and safe areas.
Actual Behavior:
UIScreen.main.bounds changes to 414×896 upon PiP controller creation on iPhone 11 iOS 18+.
A zero-rect secondary UIWindow is created, impacting layout queries.
Keyboard clipping and UI layout bugs occur.
The app viewport and coordinate space are inconsistent with device hardware.
(Note: this is part 1 of a 3 part posting. See Part 2 or Part 3)
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Camera & Photos.
WWDC25 Camera & Photos group lab ran for one hour at 6 PM PST on Tuesday June 10th, 2025
Introductory kick-off questions
Question 1
Tell us a little about the new AVFoundation Capture APIs we've made available in the new iOS 26 developer preview?
Cinematic Capture API (strong/weak focus, tracking focus)(scene monitoring)(simulated aperture)(dog/cat heads/groupIDs)
Camera Controls and AirPod Stem Clicks
Spatial Audio and Studio Quality AirPod Mics in Camera
Lens Smudge Detection
Exposure and Focus Rect of Interest
Question 2
I built QR code scanning into my app, but on newer iPhones I have to hold the phone very far away from the QR code, otherwise the image is blurry and it doesn't scan. Why is this happening and how can I fix it?
Every year, the cameras get better and better as we push the state of the art on iPhone photography and videography. This sometimes results in changes to the characteristics of the lenses.
min focus distance
newer phones have multiple lenses
automatic switching behavior
Use virtual device like the builtInDualWide or built in Triple, rather than just the builtInWide
Set the videoZoomFactor to 2. You're done.
Question 3
Last year, we saw some exciting new APIs introduced in AVFoundation in the health space. With Constant Color photography, developers can take pictures that have constant color regardless of ambient lighting. There are some further advancements this year. Davide, could you tell us about them?
constant color photography is mean to remove the "tone mapping" applied to photograph captured with camera app, usually incldsuing artistic intent, and instead try to be a close as possible to the real color of the scene, regardless of the illumination
constant color images could be captured in HEIF anf JPEG laste year. this year we are adding Support for the DICOM medical imaging photo format. It is a fomrat used by the health industry to store images related to medical subjects like MRI, skin problems, xray and so on.
It's writable and also readable format on all OS26, supported through AVCapturePhotoOutput APIs and through the coregraphics api.
for coregrapphics there is a new DICOM entry in the property dictionary which includes all the dicom availbale and defined propertie in a file. finder will also display all those in the info panel
(Address why a developer would want to use it) - not for regualr picture taking apps. for those HEIF and JPEG are the preferred delivery format. use dicom if your app produces output that are health related, that you can also share with health providers or your doctors
Main session developer questions
Question 1
LiDAR vs. Dual Camera depth generation: Which resolution does the LiDAR sensor natively have (iPhone 16 Pro) and when to prefer LiDAR over Dual Camera?
Both report formats with output resolutions (we don't advertise sensor resolution)
Lidar vs Dual, etc:
Lidar: Best for absolute depth, real world scale and computer vision
Dual, etc: relative, disparity-based, less power, photo effects
Also see: 2022 WWDC session "Discovery advancements in iOS camera capture: Depth, focus and multitasking"
Question 2
Can true depth and lidar camera run at 60fps?
Lidar can do 30fps (edited)
Front true depth can do 60fps.
Question 3
What’s the first class way to use PhotoKit to reimplement a high performance photo grid? We’ve been using a LazyVGrid and the photos caching manager, but are never able to hit the holy trinity (60hz, efficient memory footprint, minimal flashes of placeholder/empty cells)
use the PHCachingImageManager to get media content delivered before you need to display it
specify the size you need for grid sized display
set the options PHVideoRequestOptionsDeliveryModeFastFormat, PHImageRequestOptionsDeliveryModeFastFormat and PHImageRequestOptionsResizeModeFast
Question 4
For rending live preview of video stream, Is there performance overhead from using async and Swift UI for image updates vs UIViewRepresentable + AVCaptureVideoPreviewLayer.self?
AVCaptureVideoPreviewLayer is the most efficient display path
Use VDO + AVSampleBufferDisplayLayer if you need to modify the image data
Swift UI image is optimized for static image content
Question 5
Is there a way to configure the AVFoundation BuiltInLiDarDepthCamera mode to provide a depth map as accurate as ARKit at close range?
The AVCaptureDepthDataOutput supports filtering that reduces noise and fills in invalid values. Consider using this for smoother depth maps
Question 6
Pyramid-based photo editing in core image (such as adobe camera raw highlights and shadows)?
First off you may want to look a the builtin filter called CIHighlightShadowAdjust
Also the noise reduction in the CIRawFilter uses a pyramid-based algorithm.
You can also write your own pyramid-based algorithms by taking an input image:
down sample it by two multiply times using imageByApplyingAffineTransform
apply additional CIKernels to each downsampled image as needed.
use a custom CIKernel to combine the results.
Question 7
Is the best way to integrate an in-app camera for a “non-camera” app UIImagePickerController?
Yes, UIImagePickerController provides system-provided UI for capturing photos and movies.
Question 8
Hello, my question is on Deferred Photo Processing? Say I have a photo capture app that adds a CIFilter to the capture. How can I take advantage of Deferred Photo Processing? Since I don’t know how to detect when the deferred captured photo is ready
CIFilter can be called on the final at that point
Photo will have to be re-inserted into the Photo library as adjustment
Question 9
For shipping photo style assets in the app that need transparency what is the best format to use? JPEG2000? will moving to this save a lot of space comapred to PNG or other options?
If you want lossless compression PNG is good and supports unpremutiplied alpha
If you want lossy compression HEIF supports premutiplied or unpremutiplied alpha
(Note: this is part 1 of a 3 part posting. See Part 2 or Part 3)
TL;DR How to solve possible racing issue of EXT-X-SESSION-KEY request and encrypted media segment request?
I'm having trouble using custom AVAssetResourceLoaderDelegate with video manifest containing VideoProtectionKey(VPK). My master manifest contains rendition manifest url and VPK url. When not using custom resource delegate, everything works fine.
My custom resource delegate is implemented in way where it first append prefix to scheme of the master manifest url before creating the asset. And during handling master manifest, it puts back original scheme, make the request, modify the scheme for rendition manifest url in the response content by appending the same prefix again, so that rendition manifest request also goes into custom resource loader delegate. Same goes for VPK request. The AES-128 key is stored in memory within custom resource loader delegate object. So far so good.
The VPK is requested before segment request. But the problem comes where the media segment requests happen. The media segment request url from rendition manifest goes into custom resource loader as well and those are encrypted. I can see segment request finish first then the related VPK requests kick in after a few seconds. The previous VPK value is cached in memory so it is not network causing the delay but some mechanism that I'm not aware of causing this.
So could anyone tell me what would be the proper way of handling this situation? The native library is handling it well so I just want to know how. Thanks in advance!
I have an app (currently in development stage) which needs to use ffmpeg, so I tried searching how to embed ffmpeg in apple apps and found this article https://doc.qt.io/qt-6/qtmultimedia-building-ffmpeg-ios.html
It is working correctly for iOS but not for macOS ( I have made changes macOS specific using chatgpt and traditional web searching)
Drive link for the file and instructions which I'm following: https://drive.google.com/drive/folders/11wqlvb8SU2thMSfII4_Xm3Kc2fPSCZed?usp=share_link
Please can someone from apple or in general help me to figure out what I'm doing wrong?
I'm currently using ReplayKit for background screen recording, but I can't determine whether the screen is in landscape mode from the CMSampleBuffer. All other APIs for detecting screen orientation are foreground-based. What should I do?
We have the necessary background recording entitlements, and for many users... do not run into any issues.
However, there is a subset of users that routinely get recordings ending.. we have narrowed this down and believe it to be the work of the watch dog.
First we removed the entire view hierarchy when app is backgrounded. There is just 'Text("Recording")'
This got the CPU usage in profiler down to 0%. We saw massive improvements to recording success rate.
We walked away assuming that was enough. However we are still seeing the same sort of crashes. All in the background. We're using Observation to drive audio state changes to a Live Activity.
Are those Observations causing the problem? Why doesn't apple provide a better API to background audio? The internet is full of weird issues
https://stackoverflow.com/questions/76010213/why-is-my-react-native-app-sometimes-terminated-in-the-background-while-tracking
https://stackoverflow.com/questions/71656047/why-is-my-react-native-app-terminating-in-the-background-while-recording-ios-r
https://github.com/expo/expo/issues/16807
This is such a terrible user experience. And we have very little visibility into what is happening and why.
No where in apple documentation states that in order for background recording to work, the app can only be 'Text("Recording")'
It does not outline a CPU or memory threshold. It just kills us.
I am trying to stream audio from local filesystem.
For that, I am trying to use an AVAssetResourceLoaderDelegate for an AVURLAsset. However, Content-Length is not known at the start. To overcome this, I tried several methods:
Set content length as nil, in the AVAssetResourceLoadingContentInformationRequest
Set content length to -1, in the ContentInformationRequest
Both of these cause the AVPlayerItem to fail with an error.
I also tried setting Content-Length as INT_MAX, and setting a renewalDate = Date(timeIntervalSinceNow: 5). However, that seems to be buggy. Even after updating the Content-Length to the correct value (e.g. X bytes) and finishing that loading request, the resource loader keeps getting requests with requestedOffset = X with dataRequest.requestsAllDataToEndOfResource = true. These requests keep coming indefinitely, and as a result it seems that the next item in the queue does not get played. Also, .AVPlayerItemDidPlayToEndTime notification does not get called.
I wanted to check if this is an expected behavior or is there a bug in this implementation. Also, what is the recommended way to stream audio of unknown initial length from local file system?
Thanks!