Kyle Orland
2024-09-05 13:29:40
arstechnica.com
Last month, Google’s GameNGen AI model showed that generalized image diffusion techniques can be used to generate a passable, playable version of Doom. Now, researchers are using some similar techniques with a model called MarioVGG to see if an AI model can generate plausible video of Super Mario Bros. in response to user inputs.
The results of the MarioVGG model—available as a pre-print paper published by the crypto-adjacent AI company Virtuals Protocol—still display a lot of apparent glitches, and it’s too slow for anything approaching real-time gameplay at the moment. But the results show how even a limited model can infer some impressive physics and gameplay dynamics just from studying a bit of video and input data.
The researchers hope this represents a first step toward “producing and demonstrating a reliable and controllable video game generator,” or possibly even “replacing game development and game engines completely using video generation models” in the future.
Watching 737,000 frames of Mario
To train their model, the MarioVGG researchers (GitHub users erniechew and Brian Lim are listed as contributors) started with a public data set of Super Mario Bros. gameplay containing 280 levels' worth of input and image data arranged for machine-learning purposes (level 1-1 was removed from the training data so images from it could be used in the evaluation). The more than 737,000 individual frames in that data set were “preprocessed” into 35-frame chunks so the model could start to learn what the immediate results of various inputs generally looked like.
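The paper doesn't publish its preprocessing code, but the chunking step it describes amounts to slicing aligned frame and input streams into fixed 35-frame windows. Here is a minimal sketch of that idea; the function name, data layout, and non-overlapping stride are assumptions for illustration, not the researchers' actual pipeline:

```python
from typing import Iterator

CHUNK_LEN = 35  # chunk size stated in the paper

def chunk_gameplay(frames: list, inputs: list,
                   step: int = CHUNK_LEN) -> Iterator[dict]:
    """Yield windows of CHUNK_LEN consecutive frames paired with the
    controller inputs recorded on those same frames."""
    assert len(frames) == len(inputs), "streams must be aligned"
    for start in range(0, len(frames) - CHUNK_LEN + 1, step):
        yield {
            "frames": frames[start:start + CHUNK_LEN],
            "inputs": inputs[start:start + CHUNK_LEN],
        }
```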
To “simplify the gameplay situation,” the researchers decided to focus only on two potential inputs in the data set: “run right” and “run right and jump.” Even this limited movement set presented some difficulties for the machine-learning system, though, since the preprocessor had to look backward for a few frames before a jump to figure out if and when the “run” started. Any jumps that included mid-air adjustments (i.e., the “left” button) also had to be thrown out because “this would introduce noise to the training dataset,” the researchers write.
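In code, that filtering logic might look something like the sketch below: keep only chunks that read as “run right” or “run right and jump,” scan backward a few frames before a jump to confirm the run had started, and discard any chunk containing a “left” press. Button names and the lookback length are hypothetical; the paper doesn't specify them in this form:

```python
LOOKBACK = 4  # assumed number of frames to scan back before a jump

def label_chunk(inputs: list[set[str]]) -> str | None:
    """Return 'run', 'jump', or None if the chunk should be discarded.
    Each element of `inputs` is the set of buttons held on that frame."""
    if any("left" in pressed for pressed in inputs):
        return None  # mid-air adjustments would add noise; throw it out
    jump_frames = [i for i, pressed in enumerate(inputs) if "jump" in pressed]
    if not jump_frames:
        # No jump anywhere: keep only if the player ran right throughout.
        return "run" if all("right" in pressed for pressed in inputs) else None
    # Look backward from the first jump to see if and when the run started.
    first = jump_frames[0]
    window = inputs[max(0, first - LOOKBACK):first]
    return "jump" if all("right" in pressed for pressed in window) else None
```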
After preprocessing (and about 48 hours of training on a single RTX 4090 graphics card), the researchers used a standard convolution and denoising process to generate new frames of video from a static starting game image and a text input (either “run” or “jump” in this limited case). While these generated sequences only last for a few frames, the last frame of one sequence can be used as the first frame of a new sequence, in theory creating gameplay videos of any length that still show “coherent and consistent gameplay,” according to the researchers.
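That chaining trick is simple to express as a loop: generate a short clip, seed the next clip with its final frame, and concatenate. The sketch below illustrates the structure; `generate_clip` stands in for the diffusion model's denoising step, and its name and signature are assumptions rather than the paper's actual API:

```python
def generate_video(start_frame, action: str, model, num_clips: int) -> list:
    """Chain short generated clips into one long video by reusing the
    last frame of each clip as the seed frame of the next."""
    video = [start_frame]
    for _ in range(num_clips):
        # The model returns a few frames conditioned on a single seed
        # frame plus a text action ("run" or "jump" in this case).
        clip = model.generate_clip(frame=video[-1], action=action)
        video.extend(clip[1:])  # clip[0] duplicates the seed frame
    return video
```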