Hello,
I am developing a visionOS application and am interested in obtaining detailed data of users’ hands through ARKit, including but not limited to Transform and rotation angle. I have reviewed Happy Beem, but it appears to only introduce the method of identifying the user’s specific gestures.
Could you please advise on how to obtain the Transform and rotation angle of the user’s hand?
Thank you.
Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I downloaded the official sample project “Accessing the Main Camera”, but I found that it’s not able to retrieve the camera feed on visionOS 26.1. After checking the debug logs, it seems the issue is caused by the system being unable to find the expected format.
I tested on a device running visionOS 2, and the camera feed worked correctly — but only when using the sample code from the visionOS 2 version, not the current one. I also noticed that some of the APIs have changed between versions.
Has anyone managed to successfully access the camera feed on visionOS 26.1?
I want to implement the functions in this video, how should I set the window
I have a visionOS app where I instantiate ARKitSession and various providers (HandTrackingProvider and WorldTrackingProvider) in my appModel. That way, I can pass these providers to a Task which runs a gRPC server for sending the data from these providers to a client. When the users enters the immersive space of the app, the ARKitSession will run the providers if they are not running already.
I am now trying to implement the AccessoryTrackingProvider with the PSVR sense controllers but it does not fit with my current framework because the controllers may not be connected when the ARKitSession.run function is called. So I need to find a new place to start the session.
My question is, if I already have a session which is running the hand and world tracking providers, can I start another session to run the accessory tracking? Should they all be running on the same session?
Is there a way to stop the session and restart it when the controllers are connected? When I tried this, I get an error that says "It is not possible to re-run a stopped data provider (<ar_hand_tracking_provider_t: " but if I instantiate a new HandTrackingProvider, then the one that got passed to the gRPC task would no longer be the one running in the new session.
Any advice on how best to manage the various providers and ARKit sessions would be greatly appreciated.
SpatialEventGesture Not Working to Show Hidden Menu in Immersive Panorama View - visionOS
Problem Description
I'm developing a Vision Pro app that displays 360° panoramic photos in a full immersive space. I have a floating menu that auto-hides after 5 seconds, and I want users to be able to show the menu again using spatial gestures (particularly pinch gestures) when it's hidden.
However, the SpatialEventGesture implementation is not working as expected. The menu doesn't appear when users perform pinch gestures or other spatial interactions in the immersive space.
Current Implementation
Here's the relevant gesture detection code in my ImmersiveView:
import SwiftUI
import RealityKit
struct ImmersiveView: View {
@EnvironmentObject var appModel: AppModel
@Environment(\.openWindow) private var openWindow
var body: some View {
RealityView { content in
// RealityView content setup with panoramic sphere...
let rootEntity = Entity()
content.add(rootEntity)
// Load panoramic content here...
}
// Using SpatialEventGesture to handle multiple spatial gestures
.gesture(
SpatialEventGesture()
.onEnded { eventCollection in
// Check menu visibility state
if !appModel.isPanoramaMenuVisible {
// Iterate through event collection to handle various gestures
for event in eventCollection {
switch event.kind {
case .touch:
print("Detected spatial touch gesture, showing menu")
showMenuWithGesture()
return
case .indirectPinch:
print("Detected spatial pinch gesture, showing menu")
showMenuWithGesture()
return
case .pointer:
print("Detected spatial pointer gesture, showing menu")
showMenuWithGesture()
return
@unknown default:
print("Detected unknown spatial gesture: \(event.kind)")
showMenuWithGesture()
return
}
}
}
}
)
// Keep long press gesture as backup
.simultaneousGesture(
LongPressGesture(minimumDuration: 1.5)
.onEnded { _ in
if !appModel.isPanoramaMenuVisible {
print("Detected long press gesture, showing menu")
showMenuWithGesture()
}
}
)
}
private func showMenuWithGesture() {
if !appModel.isPanoramaMenuVisible {
appModel.showPanoramaMenu()
if !appModel.windowExists(id: "PanoramaMenu") {
openWindow(id: "PanoramaMenu", value: "menu")
}
}
}
}
What I've Tried
Multiple SpatialTapGesture approaches: Originally tried using multiple .gesture() modifiers with SpatialTapGesture(count: 1) and SpatialTapGesture(count: 2), but realized they override each other.
SpatialEventGesture implementation: Switched to SpatialEventGesture to handle multiple event types (.touch, .indirectPinch, .pointer), but pinch gestures still don't trigger the menu.
Added debugging: Console logs show that the gesture callbacks are never called when performing pinch gestures in the immersive space.
Backup LongPressGesture: Added a simultaneous long press gesture as backup, which also doesn't work consistently.
Expected Behavior
When the panorama menu is hidden (after 5-second auto-hide), users should be able to:
Perform a pinch gesture (indirect pinch) to show the menu
Tap in space to show the menu
Use other spatial gestures to show the menu
Questions
Is SpatialEventGesture the correct approach for detecting gestures in a full immersive RealityView?
Are there any special considerations for gesture detection when the RealityView contains a large panoramic sphere that might be intercepting gestures?
Should I be using a different gesture approach for visionOS immersive spaces?
Is there a way to ensure gestures work even when the RealityView content (panoramic sphere) might be blocking them?
Environment
Xcode 16.1
visionOS 2.5
Testing on Vision Pro device
App uses SwiftUI + RealityKit
Any guidance on the proper way to implement spatial gesture detection in visionOS immersive spaces would be greatly appreciated!
Additional Context
The app manages multiple windows and the gesture detection should work specifically when in the immersive panorama mode with the menu hidden.
Thank you for any help or suggestions!
As I understand it there are two ways I can track a hand, or a joint, in RealityKit:
either, create an AnchorEntity, for example AnchorEntity(.hand(.left, location: .palm))
or, set up an ARSession with a HandTrackingProvider ( a lot more code which I haven't repeated here).
Assuming this is correct, when would I want to use one over the other?
Hello,
I am experimenting with Unity to develop a mixed reality (MR) application for visionOS. I would like to understand the best approach for structuring my project:
Should I build the entire experience in Unity (both Windows and Volumes)?
Or is it better to create only certain elements (e.g., Volumes) in Unity while managing Windows separately in Xcode?
Also, how well do interactions (e.g pinch, grab…) created in Unity integrate with Xcode?
If I use the PolySpatial plugin, does that allow me to manage all interactions entirely within Unity, or would I still need to handle/integrate part of it in Xcode?
What's worked best for you? Please let me know if you have any recommendations, Thanks!
Topic:
Spatial Computing
SubTopic:
General
Tags:
Vision
Reality Composer Pro
visionOS
iPad and iOS apps on visionOS
Hi All,
We're a studio building an app and as part of a scene we have a 3D asset with a smoke particle emitter and a curved mesh that plays video. I notice that when the video alone is played or the particle effect alone is done then the scene works fine but the frame rate drops drastically when both are turned on.
How do I solve this because this is an important storytelling feature.
I am develop visionOS app. I am now very interested in Metal and Compositor Services, but I have not explored them in depth. I know that Metal has a higher degree of control freedom. I am wondering if using Compositor Services will have fewer functions than RealityKit in AR technology (such as scene reconstruction and understanding, hover effect, etc.).
Hi, I'm developing a virtual camera system using ReplayKit to capture scene video by directly accessing raw video buffers. The capture mechanism works flawlessly when repeatedly starting and stopping video capture within a continuous immersive environment. However, a critical issue arises when interrupting the immersive space:
Step 1: Enter immersive environment and start and stop capture videos(Multiple times with no issues)
Step 2: Press the crown button to exit the immersive environment
Step 3: Return to the immersive space subsequently
Step 4: Attempt to start the video capture
At this point, the startCapture method throws an unexpected error, disrupting the video capture workflow.
This is the Xcode error that I see " [ERROR] -[RPScreenRecorder startCaptureWithHandler:completionHandler:]_block_invoke_2:500 failed to start due to error: Error Domain=com.apple.ReplayKit.RPRecordingErrorDomain Code=-5803 "Recording failed to start" UserInfo={NSLocalizedDescription=Recording failed to start}"
I have tried all possible ways to stopCapture including OnDisappear and other methods and nothing seems to solve this.
Description:
I'm developing a travel/panorama viewing app for visionOS that allows users to view 360° panoramic images in an immersive space. When users enter panorama viewing mode, I want to provide a fully immersive experience where the main interface window and Earth 3D globe window are hidden.
I've implemented the app following Apple's documentation on Creating Fully Immersive Experiences, but when users enter the immersive space, both the main window and the Earth 3D window remain visible, diminishing the immersive experience.
Implementation Details:
My app has three main components:
A main content window showing panorama thumbnails
A 3D globe window (volumetric) showing locations
An immersive space for viewing 360° panoramas
I'm using .immersionStyle(selection: $panoImageView, in: .full) to create a fully immersive experience, but other windows remain visible.
Relevant Code:
@main
struct Travel_ImmersiveApp: App {
@StateObject private var appModel = AppModel()
@State private var panoImageView: ImmersionStyle = .full
var body: some Scene {
WindowGroup {
ContentView()
.environmentObject(appModel)
}
.windowStyle(.automatic)
.defaultSize(width: 1280, height: 825)
WindowGroup(id: "Earth") {
Globe3DView()
.environmentObject(appModel)
.onAppear {
appModel.isGlobeWindowOpen = true
appModel.globeWindowOpen = true
}
.onDisappear {
if !appModel.shouldCloseApp {
appModel.handleGlobeWindowClose()
}
}
}
.windowStyle(.volumetric)
.defaultSize(width: 0.8, height: 0.8, depth: 0.8, in: .meters)
.windowResizability(.contentSize)
ImmersiveSpace(id: "ImmersiveView") {
ImmersiveView()
.environmentObject(appModel)
}
.immersionStyle(selection: $panoImageView, in: .full)
}
}
Opening the Immersive Space:
func getPanoImageAndOpenImmersiveSpace() async {
appModel.clearMemoryCache()
do {
let canView = appModel.canViewImage(image)
if canView {
let downloadedImage = try await appModel.getPanoramaImage(for: image) { progress in
Task { @MainActor in
cardState = .loading(progress: progress)
}
}
await MainActor.run {
appModel.updateCurrentImage(image, panoramaImage: downloadedImage)
}
if !appModel.immersiveSpaceOpened {
try await openImmersiveSpace(id: "ImmersiveView")
await MainActor.run {
appModel.immersiveSpaceOpened = true
cardState = .normal
}
} else {
await MainActor.run {
appModel.updateImmersiveView = true
cardState = .normal
}
}
} else {
await MainActor.run {
appModel.errorMessage = "You do not have permission to view this image."
cardState = .normal
}
}
} catch {
// Error handling
}
}
Immersive View Implementation:
struct ImmersiveView: View {
@EnvironmentObject var appModel: AppModel
var body: some View {
RealityView { content in
let rootEntity = Entity()
content.add(rootEntity)
Task {
if let selectedImage = appModel.selectedImage,
appModel.canViewImage(selectedImage) {
await loadPanorama(for: rootEntity)
}
}
} update: { content in
if appModel.updateImmersiveView,
let selectedImage = appModel.selectedImage,
appModel.canViewImage(selectedImage),
let rootEntity = content.entities.first {
Task {
await loadPanorama(for: rootEntity)
appModel.updateImmersiveView = false
}
}
}
.onAppear {
print("ImmersiveView appeared")
}
.onDisappear {
appModel.resetImmersiveState()
}
}
// loadPanorama implementation...
}
What I've Tried
Set immersionStyle to .full as recommended in the documentation
Confirmed that the immersive space is properly opened and displaying panoramas
Verified that the state management for the immersive space is working correctly
Questions
How can I ensure that when the user enters the immersive panorama viewing experience, all other windows (main interface and Earth 3D globe) are automatically hidden?
Is there a specific API or approach I'm missing to properly implement a fully immersive experience that hides all other windows?
Do I need to manually dismiss the windows when opening the immersive space, and if so, what's the best approach for doing this?
Any guidance or sample code would be greatly appreciated. Thank you!
Hi I am trying to implement something simple as people can share their Spatial Photos with others (just like this post). I encountered the same issue with him, but his answer doesn't help me out here.
Briefly speaking, I am using CGImgaeSoruce to extract paired leftImage and rightImage from one fetched spatial photo
let photos = PHAsset.fetchAssets(with: .image, options: nil)
// enumerating photos ....
if asset.mediaSubtypes.contains(PHAssetMediaSubtype.spatialMedia) {
spatialAsset = asset
}
// other code show below
I can fetch left and right images from native Spatial Photo (taken by Apple Vision Pro or iPhone 15+), but it didn't work on generated spatial photo (2D -> 3D feat in Photos).
// imageCount is 1 when it comes to generated spatial photo
let imageCount = CGImageSourceGetCount(source)
I searched over the net and someone says the generated version is having a depth image instead of left/right pair. But still I cannot extract any depth image from imageSource.
The full code below, the imagePair extraction will stop at "no groups found":
func extractPairedImage(phAsset: PHAsset, completion: @escaping (StereoImagePair?) -> Void) {
let options = PHImageRequestOptions()
options.isNetworkAccessAllowed = true
options.deliveryMode = .highQualityFormat
options.resizeMode = .none
options.version = .original
return PHImageManager.default().requestImageDataAndOrientation(for: phAsset, options: options) {
imageData, _, _, _ in
guard let imageData,
let imageSource = CGImageSourceCreateWithData(imageData as CFData, nil)
else {
completion(nil)
return
}
let stereoImagePair = stereoImagePair(from: imageSource)
completion(stereoImagePair)
}
}
}
func stereoImagePair(from source: CGImageSource) -> StereoImagePair? {
guard let properties = CGImageSourceCopyProperties(source, nil) as? [CFString: Any] else {
return nil
}
let imageCount = CGImageSourceGetCount(source)
print(String(format: "%d images found", imageCount))
guard let groups = properties[kCGImagePropertyGroups] as? [[CFString: Any]] else {
/// function returns here
print("no groups found")
return nil
}
guard
let stereoGroup = groups.first(where: {
let groupType = $0[kCGImagePropertyGroupType] as! CFString
return groupType == kCGImagePropertyGroupTypeStereoPair
})
else {
return nil
}
guard let leftIndex = stereoGroup[kCGImagePropertyGroupImageIndexLeft] as? Int,
let rightIndex = stereoGroup[kCGImagePropertyGroupImageIndexRight] as? Int,
let leftImage = CGImageSourceCreateImageAtIndex(source, leftIndex, nil),
let rightImage = CGImageSourceCreateImageAtIndex(source, rightIndex, nil),
let leftProperties = CGImageSourceCopyPropertiesAtIndex(source, leftIndex, nil),
let rightProperties = CGImageSourceCopyPropertiesAtIndex(source, rightIndex, nil)
else {
return nil
}
return (leftImage, rightImage, self.identifier)
}
Any suggestion? Thanks
visionOS 2.4
My VisionOS App (Travel Immersive) has two interface windows: a main 2D interface window and a 3D Earth window. If the user first closes the main interface window and then the Earth window, clicking the app icon again will only launch the Earth window while failing to display the main interface window. However, if the user closes the Earth window first and then the main interface window, the app restarts normally.
Below is the code of
import SwiftUI
@main
struct Travel_ImmersiveApp: App {
@StateObject private var appModel = AppModel()
var body: some Scene {
WindowGroup(id: "MainWindow") {
ContentView()
.environmentObject(appModel)
.onDisappear {
appModel.closeEarthWindow = true
}
}
.windowStyle(.automatic)
.defaultSize(width: 1280, height: 825)
WindowGroup(id: "Earth") {
if !appModel.closeEarthWindow {
Globe3DView()
.environmentObject(appModel)
.onDisappear {
appModel.isGlobeWindowOpen = false
}
} else {
EmptyView() // 关闭时渲染空视图
}
}
.windowStyle(.volumetric)
.defaultSize(width: 0.8, height: 0.8, depth: 0.8, in: .meters)
ImmersiveSpace(id: "ImmersiveView") {
ImmersiveView()
.environmentObject(appModel)
}
}
}
This modifier in visionOS 2.5 works perfectly with LazyVgrid inside a Stack in ScrollView:
.hoverEffect { effect, isActive, _ in
effect.scaleEffect(isActive ? 1.1 : 1.0)
But the grid does not scroll in visionOS 26 beta 1 unless the scaleEffect is commented out.
FB17941468
With Xcode 26, loading ressources with RealityKit is extremely slow.
Here my project takes almost 50 seconds to load.
I also get multiple Hang detected messages in the console:
When I uncheck "Debug executable" in the schema, the same project loads in 2 seconds.
I'm using RealityKit asynchronous loading:
private static func loadFromRealityComposerPro(
named entityName: String,
fromSceneNamed sceneName: String
) async -> Entity? {
var entity: Entity?
do {
let scene = try await Entity(
named: sceneName,
in: visionPetsContentBundle
)
entity = scene.findEntity(named: entityName)
} catch {
print(
"Error loading \(entityName) from scene \(sceneName): \(error.localizedDescription)"
)
}
return entity
}
Anyone having the same problem?
Topic:
Spatial Computing
SubTopic:
General
Can an app made with the Room Plan API be used on iPhones without LIDAR? If so, how much accuracy would be lost compared to iPhones with LIDAR?
If not, is there an API similar to RoomPlan that works on iPhones without LiDAR?
The Section struct only publicly makes the center property available, but this is a SIMD3 that doesn't seem to line up with the rest of the model. All other objects have a 4x4 transform matrix that accurately gives each position and rotation.
When inspecting a Section in the debugger, many more properties are visible such as polygon and transform. Why are these not visible? The transform in particular seems necessary to make any sort of use of the Sections.
I have a grpc server running inside of a task. When the user takes the headset off, the grpc server will no longer work when they put the headset back on.
I would like to have this action detected so that I can cancel the task (which will effectively close the grpc server).
I am also using a visual indicator to let the user know if the server is running, but it will not accurately reflect the state of the server when removing and putting back on the headset.
Topic:
Spatial Computing
SubTopic:
General
I want to make a model with added bones move by dragging it with gestures,This is a model exported using Blender,
What I understand is using IKComponent, but I don't know how to use it specifically
We were having an issue wrb the system rotate and scale gestures (two-handed gestures / RotateGesture3D and MagnifyGesture) were extremely difficult to register (make work) in the visionOS simulator.
The solution we found was to:
Launch your app in the simulator
Move the pointer on top of the 3D object for which you are testing rotation and scaling gestures.
Press and hold the Option key to display touch points (ie: the two-handed gesture points).
While maintaining the option key pressed, release the pointer and re-enable it again. I am using a track pad with tap-to-click enabled and three-finger to drag enabled in accessibility, so "release the pointer and re-enable it again" translates simply to removing the three finger and placing them again on the trackpad.
If you have maintained the option key pressed, then you should now be able to rotate and scale the 3D object.
Context if you are interested:
Our issue was also occurring in Apple's own sample project relating to gestures "Transforming RealityKit entities using gestures", at below link.
On Apple's article "Interacting with your app in the visionOS simulator" at the below link, for two-handed gestures it states "Press and hold the Option key to display touch points. Move the pointer while pressing the Option key to change the distance between the touch points. Move the pointer and hold the Shift and Option keys to reposition the touch points."
This simply did not work anymore for rotation and scaling gestures.
These gestures used to be a lot more responsive in Sonoma. Either the article should be updated to what I described above, or there is an issue. Our colleague who is using macOS Sonoma 14.6.1 with the latest release of Xcode is not having these issues.
Here is the list of configurations (troubleshooting we tried!) where it is difficult to achieve rotation and scaling gestures in the visionOS simulator:
macOS Sequoia 16.1 Beta, Xcode 16.1 RC w visionOS 2.1
macOS Sequoia 16.1 Beta, Xcode 16.1 RC w visionOS 2.0
macOS Sequoia 16.1 Beta, Xcode 16.2 Beta 1 w visionOS 2.1
macOS Sequoia 16.1 Beta, Xcode 16.2 Beta 1 w visionOS 2.0
macOS Sequoia 16.1 Beta, remove all Xcodes and installed the build from AppStore (Xcode 16.1)
macOS Sequoia 16.1 Beta, Xcode 16.0 w visionOS 2.0
completely wiped out, and reset entire development machine, re-installed latest releases of sequoia (15.1) and xcode (15.1))
Throughout these troubleshooting I often:
restarted both xcode and sim
erased all derived data
erased all contents and settings from sims
performed fresh git clones
None of the above worked, only the workaround described above works atm. As you can maybe deduce, it was very time consuming to find the workaround, we also wasted some development effort thinking our gesture development was no-good.
Hopefully this will help other devs.
Article Link:
https://developer.apple.com/documentation/xcode/interacting-with-your-app-in-the-visionos-simulator
Gesture sample project link:
https://developer.apple.com/documentation/realitykit/transforming-realitykit-entities-with-gestures