Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.

All subtopics
Posts under Media Technologies topic

Post

Replies

Boosts

Views

Activity

WideFOV - APMP - Stereo
Does anyone have a template of an Apple Projected Media Profile Format Description or a File of a Stereo wideFOV video? Use case I have 2 compatible cameras that I stereo sync and I want to move the projection information from the compatible video to the Spatial video that combines them. Every version I can come up with crashes the AVP and when viewing as Spatial in Tahoe I just get a black screen.
4
0
241
Jun ’25
Raycasting VNFaceLandmarkRegion2D
Hello, Does anyone have a recipe on how to raycast VNFaceLandmarkRegion2D points obtained from a frame's capturedImage? More specifically, how to construct the "from" parameter of the frame's raycastQuery from a VNFaceLandmarkRegion2D point? Do the points need to be flipped vertically? Is there any other transformation that needs to be performed on the points prior to passing them to raycastQuery?
4
0
320
Sep ’25
How to reduce CMSampleBuffer volume
Hello, Basically, I am reading and writing an asset. To simplify, I am just reading the asset and rewriting it into an output video without any modifications. However, I want to add a fade-out effect to the last three seconds of the output video. I don’t know how to do this. So far, before adding the CMSampleBuffer to the output video, I tried reducing its volume using an extension on CMSampleBuffer. In the extension, I passed 0.4 for testing, aiming to reduce the video's overall volume by 60%. My question is: How can I directly adjust the volume of a CMSampleBuffer? Here is the extension: extension CMSampleBuffer { func adjustVolume(by factor: Float) -> CMSampleBuffer? { guard let blockBuffer = CMSampleBufferGetDataBuffer(self) else { return nil } var length = 0 var dataPointer: UnsafeMutablePointer<Int8>? guard CMBlockBufferGetDataPointer(blockBuffer, atOffset: 0, lengthAtOffsetOut: nil, totalLengthOut: &length, dataPointerOut: &dataPointer) == kCMBlockBufferNoErr else { return nil } guard let dataPointer = dataPointer else { return nil } let sampleCount = length / MemoryLayout<Int16>.size dataPointer.withMemoryRebound(to: Int16.self, capacity: sampleCount) { pointer in for i in 0..<sampleCount { let sample = Float(pointer[i]) pointer[i] = Int16(sample * factor) } } return self } }
4
0
447
May ’25
On iOS 18, Mandarin is read aloud as Cantonese
Please include the line below in follow-up emails for this request. Case-ID: 11089799 When using AVSpeechUtterance and setting it to play in Mandarin, if Siri is set to Cantonese on iOS 18, it will be played in Cantonese. There is no such issue on iOS 17 and 16. 1.let utterance = AVSpeechUtterance(string: textView.text) let voice = AVSpeechSynthesisVoice(language: "zh-CN") utterance.voice = voice 2.In the phone settings, Siri is set to Cantonese
4
1
705
2d
How to use the SpeechDetector Module
I am trying to use SpeechDetector Module in Speech framework along with SpeechTranscriber. and it is giving me an error Cannot convert value of type 'SpeechDetector' to expected element type 'Array.ArrayLiteralElement' (aka 'any SpeechModule') Below is how I am using it let speechDetector = Speech.SpeechDetector() let transcriber = SpeechTranscriber(locale: Locale.current, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange]) speechAnalyzer = try SpeechAnalyzer(modules: [transcriber,speechDetector])
4
2
455
Aug ’25
Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }
4
0
472
Aug ’25
VTFrameRateConversionConfiguration don't support 640x480
hello, I'm using VideoTololbox VTFrameRateConversionConfiguration to perform frame interpolation: https://developer.apple.com/documentation/videotoolbox/vtframerateconversionconfiguration?language=objc ,when using 640x480 vidoe input, I got error: Error ! Invalid configuration [VEEspressoModel] build failure : flow_adaptation_feature_extractor_rev2.espresso.net. Configuration: landscape640x480 [EpsressoModel] Cannot load Net file flow_adaptation_feature_extractor_rev2.espresso.net. Configuration: landscape640x480 Error: failed to create FRCFlowAdaptationFeatureExtractor for usage 8 Failed to switch (0x12c40e140) [usage:8, 1/4 flow:0, adaptation layer:1, twoStage:0, revision:2, flow size (320x240)]. Could not init FlowAdaptation initFlowAdaptationWithError fail tried 2048x1080 is ok.
3
0
405
Dec ’25
Photos are captured with incorrect exposure bias in specific scenarios on iPhone 17 Pro
Hey, There seems to be an inconsistency when capturing a photo using QualityPrioritization.Quality on the iPhone 17 Pro Main wide Lens. If you zoom above "2x" the output image always has "-2.0ev" bias in the meta data and looks underexposued. This does not happen at zoom levels above 2, or if you set the QualityPrioritization to .Balanced. See below: with .Quality with .Balanced This does not happen on the other lenses. I'm using a simple set up and it is consistent across JPEG and ProRAW capture. I have a demo project if that is useful. Thanks, Alex
3
0
531
Dec ’25
Save MPEG-TS (h264 or HEVC) video stream using AVAssetWriter.
I'm capturing video stream from GoPro camera (I demux UDP MPEG-TS packets) and create CMSampleBuffers from them, this works fine when I display them using CMSampleBufferLayer. However when I dump them to disk using AVAssetWriter and then playback it with AVPlayer, AVPlayer has problems with scrubbing, it also cannot render previous frames, it needs to go back to key frames. Also thumbnails generated with AVAssetImageGenerator are mostly distorted and green, even though I set the requestedTimeToleranceAfter longer than the key frames frequency. When I re-encode saved video once again with AVAssetExportSession and play it back then I can scrub the video just fine. Is it because re-transcoding adds additional metadata to enable generating frames when rewinding the video and scrubbing? If so is there a way to achieve it with AVAssetWriter without much time penalty? I need the dump/save operation to be very fast. I also considered the following: Instead of de-muxing video and creating CMSampleBuffers, maybe I could directly dump the stream to disk and somehow add moov atoms with timing information. Would this approach work? If so where I can find information how to do it? Thank you!
3
0
174
Apr ’25
How to transcode HEIC to JPEG while preserving the ISO 21496-1 gain map
When shooting with an iPhone 15 or later, it’s possible to capture HEIC or JPEG images that include gain map information conforming to the ISO 21496-1 standard. However, during image format transcoding, the HEIC codec is able to preserve the ISO 21496-1 gain map. But when converting from HEIC to JPEG, the gain map is transformed into the Apple Gain Map format instead. Is there any solution to this issue?
3
0
1.2k
Jul ’25
PDF Page Content Swapping on iOS 26
Dear Apple Developer Team, On iOS 26, the contents of PDF pages appear to be swapped. Could you please advise if there is a workaround or a planned fix for this issue? Steps to Reproduce: Download the attached PDF on iOS 26. Open the PDF in the Files app. Tap the PDF to view it in Quick Look. Navigate to page 5. Expected Result: The page number displayed at the bottom should be 5. Actual Result: The page number displayed at the bottom is 4. Issue: This is not limited to page 5—multiple page contents appear to be swapped. I have also submitted feedback via Feedback Assistant (FB20743531) on October 20. Best regards, Yoshihito Suezawa
3
0
398
Nov ’25
AVAudioEngine Voice Processing Fails with Mismatched Input/Output Devices: AggregateDevice Channel Count Mismatch
I'm encountering errors while using AVAudioEngine with voice processing enabled (setVoiceProcessingEnabled(true)) in scenarios where the input and output audio devices are not the same. This issue arises specifically with mismatched devices, preventing the application from functioning as expected. Works: Paired devices (e.g., MacBook Pro mic → MacBook Pro speakers) Fails: Mismatched devices (e.g., AirPods mic → MacBook Pro speakers) When using paired input and output devices: The setup works as expected. Example: MacBook Pro microphone → MacBook Pro speakers. When using mismatched devices: AVAudioEngine setup fails during aggregate device construction. Example: AirPods microphone → MacBook Pro speakers. Error logs indicate a channel count mismatch. Here are the partial logs. Due to the content limit, I cannot post the entire logs. AUVPAggregate.cpp:1000 client-side input and output formats do not match (err=-10875) AUVPAggregate.cpp:1036 err=-10875 AVAEInternal.h:109 [AVAudioEngineGraph.mm:1344:Initialize: (err = PerformCommand(*outputNode, kAUInitialize, NULL, 0)): error -10875 AggregateDevice.mm:329 Failed expectation of constructed aggregate (312): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (312): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) AggregateDevice.mm:182 error fetching default pair AggregateDevice.mm:329 Failed expectation of constructed aggregate (336): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (336): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) AUHAL.cpp:1782 ca_verify_noerr: [AudioDeviceSetProperty(mDeviceID, NULL, 0, isInput, kAudioDevicePropertyIOProcStreamUsage, theSize, theStreamUsage), 560227702] AudioHardware-mac-imp.cpp:3484 AudioDeviceSetProperty: no device with given ID AUHAL.cpp:1782 ca_verify_noerr: [AudioDeviceSetProperty(mDeviceID, NULL, 0, isInput, kAudioDevicePropertyIOProcStreamUsage, theSize, theStreamUsage), 560227702] AggregateDevice.mm:182 error fetching default pair AggregateDevice.mm:329 Failed expectation of constructed aggregate (348): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (348): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) Is it possible to use voice processing with different input/output devices? If yes, are there any specific configurations required to handle mismatched devices? How can we resolve channel count mismatch errors during aggregate device construction? Are there settings or API adjustments to enforce compatibility between input/output devices? Are there any workarounds or alternative approaches to achieve voice processing functionality with mismatched devices? For instance, can we force an intermediate channel configuration or downmix input/output formats?
3
2
800
Mar ’25
Launch The Main App from LockedCameraCapture
If the app is launched from LockedCameraCapture and if the settings button is tapped, I need to launch the main app. CameraViewController: func settingsButtonTapped() { #if isLockedCameraCaptureExtension //App is launched from Lock Screen //Launch main app here... #else //App is launched from Home Screen self.showSettings(animated: true) #endif } In this document: https://developer.apple.com/documentation/lockedcameracapture/creating-a-camera-experience-for-the-lock-screen Apple asks you to use: func launchApp(with session: LockedCameraCaptureSession, info: String) { Task { do { let activity = NSUserActivityTypeLockedCameraCapture activity.userInfo = [UserInfoKey: info] try await session.openApplication(for: activity) } catch { StatusManager.displayError("Unable to open app - \(error.localizedDescription)") } } } However, the documentation states that this should be placed within the extension code - LockedCameraCapture. If I do that, how can I call that all the way down from the main app's CameraViewController?
3
0
567
Nov ’25
Why Does AVCaptureSessionInterruptionReasonVideoDeviceNotAvailableWithMultipleForegroundApps Occur on iPhone?
Hi everyone, We're encountering an unexpected issue with our iPhone-only camera app: 👉 TimeMark - Photo Proof https://apps.apple.com/us/app/timemark-photo-proof/id6446071834 Problem Description: Our app uses a full-screen camera view via AVCaptureSession. In some cases reported by users, the camera fails immediately upon app launch, and we receive this interruption reason: AVCaptureSessionInterruptionReasonVideoDeviceNotAvailableWithMultipleForegroundApps According to the Apple documentation https://developer.apple.com/documentation/avfoundation/avcapturesession/interruptionreason/videodevicenotavailablewithmultipleforegroundapps?language=objc , this interruption typically occurs when the app is running in a multi-app layout such as Slide Over, Split View, or Picture in Picture — all of which are iPad-only features. However, this issue is being reported on iPhones, and our app does not support iPad at all. Also noted in the documentation: "Given your present AVCaptureSession configuration, the session may only be run if your app occupies the full screen." Additional Context: The issue occurs immediately on app launch, before the user can interact with the camera. We don’t enable multitaskingCameraAccessEnabled. We are 100% sure this is happening on iPhone, not iPad. It’s hard to reproduce; users report it happening sporadically. Locally, we tried playing Picture-in-Picture videos (e.g., Safari/YouTube) before launching our app, but we could not reproduce the issue. Questions: Why is this interruption reason occurring on iPhone, which doesn’t officially support Slide Over or Split View? Could this be caused by some system-level multitasking or resource contention (e.g., Picture in Picture from FaceTime or Safari)? Would enabling multitaskingCameraAccessEnabled help prevent this issue on iPhone, even though it's designed for iPad? Enabling multitaskingCameraAccessEnabled seems to require enabling UIBackgroundModes → voip. Would adding this background mode cause any App Store review risk or rejection if our app doesn't actually use VoIP functionality? Any help, insight, or suggestions would be greatly appreciated. Thanks in advance!
3
0
865
Oct ’25
AVAudioRecorder records silence
We have application using PTT Framework to record audio messages when app is backgrounded. Right now we are using AVAudioRecorder for that purpose. And problem is one specific user has frequent issue - recorded audio contains only silence. I've checked almost everything I can imagine but didn't find any possible reason of issue. Conditions: AVAudioRecorder uses following configuration: [ AVEncoderAudioQualityKey: AVAudioQuality.low.rawValue, AVFormatIDKey : kAudioFormatMPEG4AAC, AVNumberOfChannelsKey: 1, AVSampleRateKey: 16000.0 ] App waits both didBeginTransmitting and didActivate audioSession from PTChannelManager (audio session has playback category at that moment) App does AVAudioSession category change to playAndRecord App gets routeChangeNotification with categoryChange and category = playAndRecord There is no any interruption notifications from AVAudioSession during recording There is no any error notification from AVAudioRecorder Any idea what exactly I do wrong? Is there anything else I should check? Thanks in advance. P.S. it looks like recording audio with AudioUnit has the same issue, but let's exclude it from question atm for simplicity.
3
0
435
Mar ’25
Telephoto Lens Keeps Switching to Other Lenses on iPhone 16 Pro Max During PPG (Finger on Camera)
Hi, I’m building a PPG-based heart rate feature where the user places their finger over the rear telephoto camera. On iPhone 16 Pro Max, I'm explicitly selecting the telephoto lens like this: videoDevice = AVCaptureDevice.default(.builtInTelephotoCamera, for: .video, position: .back) And trying to lock it: if #available(iOS 15.0, *), device.activePrimaryConstituentDeviceSwitchingBehavior != .unsupported { try? device.lockForConfiguration() device.setPrimaryConstituentDeviceSwitchingBehavior(.locked, restrictedSwitchingBehaviorConditions: []) device.unlockForConfiguration() } I also lock everything else to prevent dynamic changes: try device.lockForConfiguration() device.focusMode = .locked device.exposureMode = .locked device.whiteBalanceMode = .locked device.videoZoomFactor = 1.0 device.automaticallyEnablesLowLightBoostWhenAvailable = false device.automaticallyAdjustsVideoHDREnabled = false device.unlockForConfiguration() Despite this, the camera still switches to another lens, especially under different lighting, even though the user’s finger fully covers the lens. Questions: How can I completely prevent lens switching in this scenario? Would using videoZoomFactor = 3.0 or 5.0 better enforce use of the telephoto lens? Thanks! Gal
3
0
183
Jul ’25
Playback Issues for DRM content when sending CMCD
Since iOS and tvOS 18, CMCD can now be automatically sent by AVPlayer (https://developer.apple.com/streaming/Whats-new-HLS.pdf). However, after enabling CMCD, our streams occasionally fail with the following error: CoreMediaErrorDomain Error -17383 This issue appears to affect only DRM-protected (FairPlay) streams so far. We activate CMCD via the resource loader of an AVURLAsset, before assigning the item to an AVPlayer. Unfortunately, we haven’t found a reliable way to reproduce the issue, and we’ve been unable to gather any useful diagnostic information. Has anyone else observed this behavior when enabling CMCD on FairPlay streams?
3
0
520
Oct ’25