
Playing QB in VR With a Real Ball

(also available in video essay format)

I love American football. I love watching Saquon Barkley hurdle a man backwards. I love watching Lamar Jackson stiff-arm a 250lb defensive end as he makes an off-platform touchdown throw.

It kind of makes me want to play, until I remember how much I enjoy my ligaments and ribs.

As a 25-year-old man who can at best be described as moderately athletic, I have to accept that my chances at a Natty or a Super Bowl ring are minuscule.

Thankfully, I have the next best thing: a degree in Computer Science and lots of free time on my hands.

Football without the injuries?

Isn’t the whole point of football the physicality? Well, there’s one position that never really initiates contact, and the play usually just ends when they get hit.

Luckily for us, it’s also the position that every little kid (and middle-aged man) dreams of playing. It also, uhhh, doesn’t require the same athleticism as other positions (sidenote: They are obviously still incredible athletes!).

Quarterbacking: you don't have to run if you don't want to.

We’re going to put on a VR headset that puts us on a field with 21 other virtual players, and have some cameras track a real ball that we throw to our virtual receivers. If a virtual 250lb defensive end sacks us, the play ends but our collarbone and ribs remain intact.

Good: body remains whole. Bad: look like a buffoon in public.

This leaves us with a lot of gameplay: selecting plays, making reads, throwing the football (!!), navigating the pocket, and the occasional scramble or QB run.

Would something like this be fun? Let’s build it and find out.

The hardware

A feat of engineering.

There’s an obvious choice right now for VR headsets: the Meta Quest 3. The $50 billion that our friends at Meta have invested (sidenote: I wish they spent more of that on improving the developer experience! Sometimes the headset would crash or screen tear during dev and give me a headache. Stepping away from the computer when that happened was a huge drain on productivity.) has created nothing short of a technological marvel (though not a commercial success). If you time-traveled back to 1990 and showed someone ChatGPT or the Meta Quest 3, I couldn’t say which one they would be more impressed by.

Aside from the high-resolution, high-FOV pancake lenses and plentiful compute, the Quest API offers some super cool features that could make our sim much more immersive:

  • hand tracking: we can start plays off with a clap cadence (sidenote: When the QB snaps the ball by clapping (as opposed to a verbal cue). Mostly used in college football.) or hand the ball off by shoving the football into the running back’s chest
  • room-scale tracking: being able to move freely lets us navigate the pocket, scramble, and run the ball ourselves
  • arm tracking (experimental): this lets us pump fake or stiff arm defenders
  • voice recognition: audibles! (sidenote: If a QB doesn't like what they see, they can audible to tweak the play or change to an entirely different one.)

We’re also going to need a space to play and some sort of tracking setup for the ball. Ideally, I would have set up the equivalent of a batting cage, where the physical ball gets thrown into a net but the virtual ball continues on in VR. Unfortunately, I’m doing this in San Francisco, and my backyard is on a slope! I’ll have to do this at the local park, which constrains my tracking setup to something portable and easy to set up.

Sure is a prototype!

I chose to use an NVIDIA Orin Nano — a system-on-chip (SoC) about the size of my palm with 40 TOPS of compute. The camera is a StereoLabs Zed X stereo camera, which has a global shutter — ideal for capturing fast-moving objects like a football — and a mechanism to synchronize multiple cameras, in case I find that just one isn’t enough.

Both are small and portable, and I can fit them on an old bike mechanic stand I have. I bring along a portable battery to power everything, as well as a router so the Orin Nano and Meta Quest can communicate with each other.

There are obvious improvements to this setup, but I was cognizant of having to set it up and tear it down every time I wanted to test/play. So I opted to start as simply as possible.

The software

Making a game

I decided to use Unreal Engine because I had plenty of C++ experience at my old job, and the Quest seemed to have first-class support for it. I did not have any Unreal experience when I started (other than messing around with UE3 as a child).

I speedran through Tom Looman’s Unreal course in a couple of weeks. It was taught as an elective at Stanford, so it moves at a pace and with a degree of rigor that other courses might not match. I highly recommend it for anyone with prior programming experience trying to learn Unreal.

College Football 25, this is not.

My main goal in making this was finding out whether the core concept would be fun. As such, the game is extraordinarily unpolished. For example, the players don’t have arms or legs! It would have taken ages to rig up animations of players jumping, twisting or laying out for the ball. Instead, the ball just snaps to the arm socket of the receivers or defensive backs when it enters their catch radius.

Left: handing the ball off. Right: pulling the ball to throw (not yet hooked up to ball tracking)

The rest of the game is similarly barebones. The defense randomly chooses between man coverage, a basic Cover 2 zone look, and blitzing. There are only ~10 offensive plays total, mostly designed to showcase different plays our simulator is capable of. For example, we can run RPOs by using the hand tracking to determine whether or not you pulled the ball from the running back.

Confetti when you score a touchdown is about the only "polish" there is. Mostly just wanted to mess around with the particle effects system.

There’s nothing terribly technically interesting about the game, but it did feel much more “creative” than “normal” software development. For example, to detect the clap cadence, I attached invisible blocks to the palms of each hand. When the left hand block detects a collision with the right hand block, it sends a signal to start the play. When the running back gets the ball, they find the hole in the line by generating a bunch of points in a cone in front of them and then finding the point that’s furthest from any opposing player.
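
The real game logic lives in Unreal, but the hole-finding trick is simple enough to sketch in a few lines of Python (the sample count, cone angle, and helper shapes here are all made up for illustration):

```python
import math
import random

def find_hole(rb_pos, rb_facing, defenders, n_samples=50,
              half_angle=math.radians(35), max_dist=8.0):
    """Sample points in a cone ahead of the running back and run toward
    the one that's farthest from every defender."""
    best_point, best_clearance = rb_pos, -1.0
    for _ in range(n_samples):
        angle = rb_facing + random.uniform(-half_angle, half_angle)
        dist = random.uniform(1.0, max_dist)
        point = (rb_pos[0] + dist * math.cos(angle),
                 rb_pos[1] + dist * math.sin(angle))
        # clearance = distance to the nearest defender
        clearance = min((math.hypot(point[0] - dx, point[1] - dy)
                         for dx, dy in defenders), default=float("inf"))
        if clearance > best_clearance:
            best_point, best_clearance = point, clearance
    return best_point
```

It only has to look right for the half-second the back is hitting the line, so random sampling beats anything resembling real pathfinding.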

A lot of the game is based on simple geometric logic like this. I budgeted about 3 weeks to make the game, so maybe the creativity I felt was more about finding fun tricks to quickly approximate the behavior I wanted, instead of spending a bunch of time implementing more formal methods.

Tracking the ball

Our camera essentially gives us a stream of timestamped RGB image pairs. For each pair, we can calculate a depth map. Then we run an object detection model to find the location of the football in the RGB image. The corresponding pixels in the depth map give us the position of the football relative to the camera. As long as we capture some part of the trajectory, we can fit a simple ballistics model to these (timestamp, position) points.

As pseudocode:

```python
ball_positions: list[Vector3d] = []
timestamps: list[Time] = []

for timestamp, img_left, img_right in camera_stream:
    depth_map = calculate_disparity(img_left, img_right)  # stereo pair -> depth map
    center_football = detect_football(img_left)           # 2D pixel location of the ball
    ball_pos_irl = calculate_ball_pos(center_football, depth_map)  # 3D position, camera frame
    timestamps.append(timestamp)
    ball_positions.append(ball_pos_irl)

trajectory = fit_ballistics_model(timestamps, ball_positions)
```
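
The `fit_ballistics_model` step is just linear least squares once you account for gravity. A minimal sketch with NumPy, assuming a drag-free model with z pointing up (the real thing also has to cope with outliers from bad detections):

```python
import numpy as np

G = 9.81  # m/s^2

def fit_ballistics_model(timestamps, ball_positions):
    """Fit p(t) = p0 + v0*t + (0, 0, -G/2)*t^2 to the observed positions."""
    t = np.asarray(timestamps, dtype=float)
    t = t - t[0]
    p = np.asarray(ball_positions, dtype=float).copy()  # shape (N, 3), z up
    # Add the known gravity drop back in, so every axis is linear in t...
    p[:, 2] += 0.5 * G * t**2
    # ...then solve p(t) = p0 + v0*t across all three axes at once.
    A = np.stack([np.ones_like(t), t], axis=1)  # (N, 2)
    (p0, v0), *_ = np.linalg.lstsq(A, p, rcond=None)
    return p0, v0
```

Two clean frames are technically enough to pin down the six unknowns, but every extra point smooths out depth noise.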

Tossing the ball in the park to get data

Being able to detect where a football is in an image used to be extremely hard but is trivial nowadays. I spent a couple of days recording myself tossing and throwing the football around, and (auto-)labeled a couple thousand images. Using these labeled images, I fine-tuned a tiny object detection model, YOLOv8 Nano, which the Orin Nano is more than capable of running.
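
For reference, the fine-tune itself is only a few lines with the ultralytics package (the dataset YAML and training settings below are placeholders, not my exact config):

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # the Nano variant: smallest and fastest
model.train(data="football.yaml", epochs=100, imgsz=640)  # labeled throw footage
model.export(format="engine")  # compile to a TensorRT engine for the Orin Nano
```

Almost all the effort goes into collecting and labeling the footage; the training call itself is a one-liner.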

Stereo depth perception is a well-understood enough problem that the StereoLabs camera SDK comes with several APIs to get a depth map for you. I chose to run a neural depth model because it was much more accurate. From there, we just want to run the pipeline as fast as possible: I converted all my models to TensorRT and ran as much in parallel as I could.
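
With the ZED SDK, picking the neural model is a single init parameter. Roughly, from memory of the Python API (treat the details as approximate):

```python
import pyzed.sl as sl

zed = sl.Camera()
init = sl.InitParameters()
init.depth_mode = sl.DEPTH_MODE.NEURAL  # neural depth instead of classic stereo matching
init.coordinate_units = sl.UNIT.METER

if zed.open(init) == sl.ERROR_CODE.SUCCESS and zed.grab() == sl.ERROR_CODE.SUCCESS:
    left, depth = sl.Mat(), sl.Mat()
    zed.retrieve_image(left, sl.VIEW.LEFT)         # RGB frame for the detector
    zed.retrieve_measure(depth, sl.MEASURE.DEPTH)  # depth map aligned to the left image
```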

Slower method:

```python
# everything in one loop: each stage waits on the previous one
while True:
    left_img, right_img = grab_camera_img()
    depth_map = compute_depth_map(left_img, right_img)
    football_bounding_box = yolov8(left_img)  # detect on the left image, which the depth map is aligned to
    position = estimate_position(football_bounding_box, depth_map)

    if enough_position_points_captured:
        calculate_trajectory()
        emit_trajectory_to_meta_quest()
```

Faster method:

```python
# in thread 1: depth (the slowest stage)
while True:
    left_img, right_img = grab_camera_img()
    depth_map = compute_depth_map(left_img, right_img)
    yolo_queue.push((left_img, depth_map))

# in thread 2: detection
while True:
    left_img, depth_map = yolo_queue.pop()
    football_bounding_box = yolov8(left_img)
    trajectory_queue.push((football_bounding_box, depth_map))

# in thread 3: trajectory fitting
while True:
    bb, depth_map = trajectory_queue.pop()
    football_position = estimate_position(bb, depth_map)
    ...
    if enough_position_points_captured:
        calculate_trajectory()
        emit_trajectory_to_meta_quest()
```

The tiny YOLOv8 Nano model ran much faster than the neural depth model, so there was no risk of lag from the YOLOv8 thread running behind.
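
In actual Python rather than pseudocode, the pipeline maps naturally onto threads and bounded queues. A rough skeleton, reusing the hypothetical helpers from the snippets above:

```python
import queue
import threading

# Bounded queues provide back-pressure: if a stage falls behind, the
# stage feeding it blocks instead of piling up stale frames.
yolo_queue = queue.Queue(maxsize=4)
trajectory_queue = queue.Queue(maxsize=4)

def depth_stage():
    while True:
        left_img, right_img = grab_camera_img()
        depth_map = compute_depth_map(left_img, right_img)
        yolo_queue.put((left_img, depth_map))

def detect_stage():
    while True:
        left_img, depth_map = yolo_queue.get()
        bounding_box = yolov8(left_img)
        if bounding_box is not None:  # no ball in frame? skip it
            trajectory_queue.put((bounding_box, depth_map))

for stage in (depth_stage, detect_stage):
    threading.Thread(target=stage, daemon=True).start()
```

Plain threads work fine here despite Python's GIL, since the heavy lifting happens inside CUDA/TensorRT calls that release it.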

After we gather enough frames with the ball to estimate a trajectory, we send a message to the Meta Quest so it spawns a ball with those ballistics model parameters.
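
I won't swear by my exact wire format, but the message can be tiny, since the headset only needs the fitted parameters. Something like JSON over UDP on the shared router does the job (the address, port, and field names here are made up):

```python
import json
import socket

QUEST_ADDR = ("192.168.1.42", 9000)  # headset's IP on the portable router (placeholder)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def emit_trajectory_to_meta_quest(p0, v0):
    # The Unreal side spawns a ball at p0 with velocity v0 and lets its
    # own physics (same gravity constant) carry the throw forward.
    msg = {"p0": list(p0), "v0": list(v0)}
    sock.sendto(json.dumps(msg).encode("utf-8"), QUEST_ADDR)
```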

Putting it together

To be clear, the simulator kinda sucks. I had a very hard deadline of 5 weeks to assemble all the hardware and write all the code (+3 weeks to learn Unreal). There are unresolved annoyances with aligning the coordinate systems of the tracking and the headset. The ball trajectory is often off by a couple of degrees and several miles per hour. The game merely resembles football. All of this can be fixed with more development time or better hardware. I really just cared about getting it into a minimally playable state to see if it could be fun.

So is it fun? Hell yeah. There’s something about playing a “game” using your body as the controls. I know other VR games obviously take advantage of hand tracking and whatnot, but there you’re mostly pinching air or making hand signals. In this, you’re scrambling to dodge rushers and throwing a ball using your entire kinetic chain.

Thoughts on playing

I was surprised by how difficult it was to throw. When you’re just playing catch, you have all the time in the world. In this, the pocket collapses or the routes develop in just a couple of seconds, so when you make that snap decision to throw, your feet might not be set or your body might already be open to the target (sidenote: In other words, playing QB is hard. This should be fairly obvious, since this is a job that pays about $60m/year as of 2024, but 32 NFL teams can only find ~15 competent QBs!). There were occasional plays where my feet were set, I generated power from my legs and hips, felt the ball spiral off my index finger, and saw it go exactly where I wanted, right on time. Those were euphoric.

But usually I'd just throw an interception into triple coverage like a king.

The wonderful thing is that this is all virtual, so we can tweak the difficulty as much as we want in the name of fun. For a beginner, we could literally just slow the game down! We can make the receivers and corners slow enough that routes get open in 4-5 seconds and the O-line good enough that you usually get that long to throw. As you progress in skill, higher difficulty levels would be closer to real life speed.

Managed to not hit this kid with the ball.

It’s also obvious that a production version of this would need a batting-cage-style setup with nets to catch the ball after you throw it. Throwing the ball willy-nilly while you’re in another world is kinda ridiculous. Similarly, running while you’re in another world feels dangerous; a final product would probably be more mixed-reality and use the Meta Quest’s passthrough capabilities more.

Further work?

Everyone I’ve shown this to has expressed interest in playing it, and everyone who’s played it has really enjoyed it. Unfortunately, making a real game costs several million dollars, and distributing this, whether as a purchasable simulator or a physical VR location, would probably cost several million dollars more. I have enough money to work on stuff like this for fun, but unfortunately not enough to pay other people to work on it. So it’s been sitting on the shelf (and collecting dust! I actually finished this months ago and have been slow to write/upload).

If anyone else wants to work on this, I think I’ve certainly de-risked the “Is this enjoyable?” aspect of a consumer business. The economics are still murky. I’d love to play a polished version of this as much as I’d like to make it myself. So if this inspires anyone or you just find it cool, I’m very open to sharing everything I’ve learned (sidenote: $sitename@gmail). Otherwise, after I make a couple million dollars, I guess I know what I’m blowing it all on. ∎