Reinforcement learning in Minecraft: Challenges and opportunities in multiplayer games

Video Information

Hi everyone, and welcome to this webinar on reinforcement learning in Minecraft: challenges and opportunities in multiplayer games. My name is Diego Perez-Liebana, and I'm a lecturer in computer games and AI at Queen Mary University of London. I'm part of the Game AI group, as are two of my co-presenters in this webinar, Raluca Gaina and Martin Balla, who will talk after me; they are also part of the Game AI group and are PhD students at Queen Mary University of London. We then have a final speaker, Sam Devlin from Microsoft Research, who will close the presentation.

During this webinar we're going to cover four main parts. In the first one, I'm going to introduce the main concepts and ideas behind general game AI research. After me, Raluca Gaina will talk about the different games that have been implemented in Minecraft and can be used for this particular area of research.

After Raluca, Martin Balla will talk about how to train different reinforcement learning agents in Malmo, the main platform for multi-agent games in Minecraft. Finally, Sam Devlin will talk about several open questions and the research that is happening at the moment in this domain.

Starting with the first block, general game AI research. Traditionally, something we have done quite often in game AI is to take particular games and carry out research on them. For instance, you could create an AI agent that is going to play one particular game: it could be a 3D game like Minecraft, a two-dimensional game like Seaquest, or a navigational game in which you control a ship in an environment and have to visit a series of waypoints. Normally, the way this is approached is that you have a particular agent and you try to create its AI so it plays that game as well as it can.

While this advances the field, in many cases we are interested in approaches that focus instead on one AI agent that is able to play multiple games. This allows researchers to focus exclusively on the AI methods, on the actual technology behind the agents, rather than on the specifics of every game, so we can advance AI by building methods that are applicable to many different domains and not only to one particular one.

Examples of this can be found in the literature. One you might have heard of is the work on the Atari Learning Environment using deep convolutional neural networks and deep Q-networks (DQNs), which receive an image of the game as input, analyse it to extract features automatically, and then map those to the different outputs the agent can produce, essentially the actions the agent can take in the game. By progressively and repeatedly playing these games, the agent learns and plays better and better as time progresses.

Another example is the General Video Game AI (GVGAI) competition. This is something we developed at Queen Mary, and essentially the idea is that you can play multiple arcade games in a setting in which you don't have to train on them: you have access to the model of the game, so you can plan which are the best actions, and you can play many different games, such as Space Invaders or Sokoban, or more complex ones like Portals, Butterflies, or Zelda. This was also run as a competition, and it allowed people to train and prepare agents without knowing in advance the games they would later be playing.

As a more general example, we have Stratega, a general strategy games framework that we have been developing. This tries to tackle more complex situations, in which you have games that require decisions across multiple dimensions: for instance, you might be deciding how to research different technologies, controlling multiple units that need to be managed at the same time, managing an economy, or playing against multiple enemies. We are developing this framework so that you can play many different strategy games at once, trying to focus research on the difficult questions of decision making in these domains.

But one clear example, and one of the main platforms at the moment for multi-agent games development, is Project Malmo, which is developed by Microsoft and uses Minecraft as a platform for AI research. You can find all the information in the links shown as well, but in general it takes the game Minecraft, with all its complexity, and offers a rich diversity of games that you can use for training and learning with your agents in these environments.

The main design principles of Malmo are, first, a low entry barrier: you can create your own agents in the language you prefer, whether that is Java, .NET (C#), C++, Python, or Lua. It's cross-platform, so you can use it on multiple operating systems such as Windows, Linux, or macOS. It provides a general API for agents, so you build your agent around several functions that you implement in order to interact with the environment. The environments themselves can be defined in XML, so you can specify what the world looks like, the types of objects and elements incorporated into it, and also the task itself that you want to develop: you can define the actions that are available to the agents, the different rewards that exist, how the episode ends, and what you have to do to win.

Given all these possibilities and all these languages to define different games, we are moving in Malmo from executing agents in one particular domain to a wide range of domains in which you can work, and this allows us to focus on more general, multi-task learning rather than on the narrow AI of one single environment. Not only that, but you can also use Malmo to have multiple agents in the same game. For instance, if you're playing a football game or capture the flag, like the examples on the screen, you could have multiple agents, each controlled by a different AI algorithm or controller, trying to either compete in the same game or cooperate to achieve a common goal, or you can even have a human playing with all these AIs, and there is an interesting aspect of collaboration and competition between the AIs and the humans that can also be explored in this domain.

That is the introduction to Project Malmo, and now Raluca Gaina is going to talk about the different games that have been created, examples of tasks that are interesting for doing AI research in this particular environment. Thank you, Diego. My name is Raluca, and I will be talking about the multiplayer games available in the framework. We currently have three different games, all of which are parameterized and have procedural level generation behind them, which allows many variations of each environment to be generated and provides diverse and challenging training settings for general players.

The first game is Mob Chase, previously known as Pig Chase, which was used in a previous Microsoft collaborative AI challenge. The point here is for two different agents to corner a mob in a fenced-off play area. They get a large reward for catching the mob, and a smaller individual reward if they move to an exit block instead, which could be the better choice if the agent's partner is not actually cooperating. To play this game, the agents have three actions available: they can move forward, move backward, or turn. They have the main goal of catching the mob, which awards them the most points, and a secondary goal of reaching an exit, which also ends the game but awards fewer points. All of the games we have here also have a maximum number of commands that can be sent by the agents, which is there to encourage them to complete the problems as fast as possible. With the current parameter options, which can easily be modified and extended to cover more potentially interesting spaces, we can get over 6 million variations of Mob Chase, not including the various level layouts resulting from the chosen parameters.

We can vary the look of the game, the number of objects in the level, and their position within a play area of varying size, all of which can also be used to tweak the difficulty of the game if needed. These are some examples of what the generated games might actually look like in practice. Mob Chase specifically targets AI skills based on cooperation, chasing, navigation, and exploration.

The next game we're looking at is Build Battle, in which two different agents compete to build a given structure; one agent placing a block correctly awards it points, which are subtracted from the opponent. Here the agents can move forward, backward, or turn, but they can also jump and place or destroy blocks. The main goal of the game is to copy the given structure, which awards one point, but we also see a more granular reward structure here: a small number of points is awarded for correctly placing blocks or for removing incorrectly placed ones, with the opposite for placing blocks in incorrect positions or removing blocks that were already placed correctly, and the opponent has the mirrored reward scheme. We can obtain over 10,000 variations of this game, not including different level layouts, and we vary things like the size of the structure to be built, the overall level size, the look of the game through the types of blocks used, and the percentage of the structure that is already built for the players, to generate interesting challenges of varying complexity.

To look at this game in action, we can see two different instances here. Build Battle specifically targets AI skills based on navigation, puzzle solving, and resource management.

Lastly, we have Treasure Hunt, in which one agent is defenseless and has the mission of collecting treasure, the brightly colored blocks in the pictures and videos shown here, and then making it to the exit of the dungeon. The other agent is equipped with a sword and must protect the collector from the enemies as they navigate the dungeon together. Multiple teams of collector-protector players could be added to the same game for an interesting competitive angle. In this game the agents can again move forward, backward, or turn, but they also have different abilities depending on their particular role in the challenge: the protector can attack enemies, and the collector can interact with the items in the level and collect them. The overall goal is for the collector to reach the exit, and we again see a more granular reward structure, with points awarded for each of the treasure blocks collected, as well as agents being penalized if anyone on the same team dies. Over 3 billion variations of this game can be created by varying the look of the game, combat aspects, and the number of enemies faced, as well as the overall and detailed configuration of the dungeon. In these examples we're only seeing some one-room play areas, but larger and more complex dungeons can easily be generated.

And this is what the game looks like in action. Treasure Hunt is one of the more complex scenarios we have available, targeting many AI skills such as navigation, exploration, object transportation and escorting, aiming, defeating enemies, chasing, fleeing, and obstacle and harm avoidance. Overall, the games available in the framework pose a wide variety of challenges, requiring the agents to exhibit many different skills in order to successfully solve all of the problems. Thank you very much for listening, and I will pass on to Martin, who will be talking about actually training agents to play these games.

Thank you, Raluca. Hi, my name is Martin Balla, and after seeing what the missions in Malmo look like, we turn our attention to how to get an agent to successfully play these games. For this we use reinforcement learning. In reinforcement learning, an agent interacts with an environment: initially the environment gives the agent an initial observation, and based on that observation the agent picks an action. The action is sent to the environment, which updates its internal state and returns the next observation together with a reward. The agent's objective is to accumulate the highest discounted reward it can. The observations in Minecraft are in the form of RGB pixels; the action space, in its simplest form, can be moving one grid cell forward or turning 90 degrees left or right; and the reward, in the simplest case, is +1 when the agent successfully completes a mission and -1 when it fails.
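
For reference, the "discounted reward" the agent tries to accumulate is the usual discounted return from the reinforcement learning literature (a standard definition, not a formula taken from the slides), with discount factor gamma:

```latex
G_t \;=\; \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}, \qquad 0 \le \gamma < 1
```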

The next step is to have a look at how to actually set up Malmo on your own machine; it has become a bit simpler than with previous versions. The first step is to get Java version 8 and Python 3 installed. Unfortunately, Minecraft runs on an older version of Java, so it requires Java 8 specifically and won't work with newer versions. After you have Java and Python installed, the next step is to clone the repository, and once you have it cloned you can install Malmo using pip. If you want to run the examples and the tutorials from the repository, you can optionally install the examples sub-package, but to get it running on your own machine I refer you to the project's README. We use RLlib for our examples.

RLlib is part of the Ray project; you might have come across Ray, as it's a popular Python package for speeding up computation through parallelization. RLlib provides high-quality, state-of-the-art, easily scalable RL algorithms, so you can run the same algorithm on your laptop and easily transfer it to a cluster: you just change the resources you have available, specifying more CPUs and more GPUs, and RLlib handles that automatically. It supports both TensorFlow and PyTorch, so if you don't want to learn the other framework it's not a problem.

We prepared a few notebook tutorials to lower the entry barrier for Malmo. We recommend going through them in the order listed on the slide, as they build from basic concepts to more advanced ones. The first tutorial is about how to run a random agent in Malmo: it shows you how to set up the environment and then just samples random actions at each time step. The next tutorial shows how to use RLlib to run a PPO agent in a sample mission; then we go through how to restore a checkpoint and how to evaluate it; and finally we show an example of how to run a multi-agent experiment in Malmo and RLlib.

The first notebook is about how to run a random agent in Malmo. The first thing we have to do is create the environment by calling malmoenv.make, and then for initialization we pass the mission file along with a port; that port is used to communicate between Java and the Python process. We also have the launcher, which automatically starts up the Java Minecraft instance: it just requires an array of ports (in this example we only use a single port) and a launch script, which we explain on the next slide. Then we can just use the normal reinforcement learning loop, and in this example, while the episode is not over, we simply sample a random action at each time step.
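
A minimal sketch of that random-agent loop is below. The call names (`malmoenv.make`, `env.init`), the mission file path, and the port value follow the description of the notebook given here, but should be treated as assumptions; check the repository's README and the first tutorial notebook for the exact current API.

```python
import malmoenv  # gym-style Minecraft wrapper used throughout the tutorials

# Load the mission definition (XML) and pick a port for the Java <-> Python
# communication. The file name here is an illustrative placeholder.
xml = open("missions/mobchase_single_agent.xml").read()
port = 9000

# The repository also provides a launcher that starts one Java Minecraft
# instance per port using a launch script (explained on the next slide).
# Its exact name and signature may differ, so treat this as a placeholder:
#   launch_minecraft(ports=[port], launch_script="./launchClient.sh")

env = malmoenv.make()
env.init(xml, port)  # pass the mission definition and the communication port

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()          # random action at each time step
    obs, reward, done, info = env.step(action)  # standard gym-style step
env.close()
```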

It's worth highlighting how the launcher works. Previously, you had to manually start up the Minecraft instances using a Gradle process for each environment you wanted to use; instead, we created a Python script that does it for you. The launcher takes an array of ports and, for each port, starts a new Minecraft instance. It also takes a launch script, which is a bash file that defines the launch options for Minecraft. By default, Minecraft renders a window on your desktop, but that doesn't work on, for example, headless Linux machines if you want to train for longer periods on a server; in that case you can use xvfb to export the display, which helps you run it. This was one of the issues we had with earlier versions of Malmo.

The next example is how to run a PPO agent in Malmo. PPO is one of the state-of-the-art reinforcement learning algorithms; it stands for Proximal Policy Optimization, and we are using the Tune API, which helps you run an experiment, so it's quite straightforward. As the first argument you define the algorithm, in this case PPO. Then we give a config, which is a Python dictionary defining the algorithm's parameters: it sets the learning rate and the available resources, so you can define how many CPUs and how many GPUs you want to allocate. Then we set the stopping conditions, which make the training stop after a certain number of environment-agent interactions, and we have the checkpoint arguments, which create a checkpoint at the end of training and also throughout training, after every few algorithm iterations. Finally, we set the log directory, so everything that RLlib logs is saved in a specific place.
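
A condensed sketch of that Tune call is shown below, using placeholder values. The environment name and the specific numbers are assumptions, but the overall shape (the algorithm as the first argument, a `config` dictionary, `stop` conditions, checkpointing arguments, and a log directory) follows the description above and standard Ray 1.x-era usage.

```python
from ray import tune

config = {
    "env": "malmo_mob_chase",   # assumes the Malmo env was registered under this name
    "framework": "torch",       # RLlib supports both TensorFlow and PyTorch
    "lr": 1e-4,                 # learning rate (placeholder value)
    "num_workers": 2,           # CPUs allocated to experience collection
    "num_gpus": 1,              # GPUs allocated to the learner
}

tune.run(
    "PPO",                                # the algorithm, given as the first argument
    config=config,
    stop={"timesteps_total": 1_000_000},  # stop after this many environment interactions
    checkpoint_freq=10,                   # checkpoint every N training iterations
    checkpoint_at_end=True,               # and once more when training stops
    local_dir="./results",                # where RLlib/Tune saves logs and checkpoints
)
```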

After training we can visualize the TensorBoard output that RLlib automatically saves; in fact, we don't have to wait until the end of training, we can visualize it during training as well. Here are a few examples of the kind of data we can visualize: the average episode length during training, and the maximum and average rewards collected during the episodes. RLlib collects much more data; for example, you can also visualize CPU usage, memory usage, and sampling time in milliseconds, among many others, so these are just a few examples. After we have trained an agent, we can visualize what it does. The GIF on the right is recorded using the screen recorder that we also provide in the framework: it takes the observations the agent receives and saves them as a GIF or an MP4 file. The mission here was the single-agent Mob Chase; in this example the agent just has to navigate to the brick block, which it does very nicely in the GIF.

In the next tutorial we have a look at how to restore a reinforcement learning agent in Malmo. One use case for this is that you have trained an agent and you have the checkpoint, but you want to visualize what it actually learned. In this case you can just change the configuration: you may want to use only a single CPU and, for example, no GPU for the visualization, and you can switch off exploration so the agent picks the best action it can at each time step. Then we can use the same ray.tune function as before, but now, as the last argument, we use the restore flag and provide the checkpoint file we saved earlier.
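
A sketch of that restore step: the configuration is trimmed for visualization (single worker, no GPU, exploration switched off) and the checkpoint path passed to `restore` is a placeholder; the actual path depends on where Tune saved your checkpoints.

```python
from ray import tune

eval_config = {
    "env": "malmo_mob_chase",   # assumed registered env name, as in the training sketch
    "framework": "torch",
    "num_workers": 0,           # a single local worker is enough to watch it play
    "num_gpus": 0,              # no GPU needed for visualization
    "explore": False,           # always pick the best (greedy) action
}

tune.run(
    "PPO",
    config=eval_config,
    stop={"training_iteration": 1},  # just roll out briefly to watch the agent
    restore="./results/PPO/checkpoint_000100/checkpoint-100",  # placeholder path
)
```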

Then we may want to evaluate an agent. As you've seen, the Tune API doesn't give the same flexibility as working with a normal agent: we may want to use a different level, or we may want to get more insight into what the agent does in the environment. In this case it's better to load the agent's trainer directly, so here we load the PPO trainer, restore the checkpoint file we had, and then we can directly access the policy. In the reinforcement learning loop that we've seen in the random agent example, we can call policy.compute_action and give it the current observation, and it doesn't just return the action: it also provides the action distribution, the value function, and other algorithm-specific quantities, for example Q-values. We can then take the best action the algorithm provides and send it to the environment.
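
For this more flexible evaluation, a sketch along the following lines loads the PPO trainer, restores the checkpoint, and queries the policy directly inside an ordinary environment loop. The import path and `compute_single_action` returning extra fetches (value estimates, action-distribution inputs) match Ray 1.x-era RLlib, but the details may differ in other versions, so treat them as assumptions.

```python
import malmoenv
from ray.rllib.agents.ppo import PPOTrainer  # Ray 1.x import path

config = {
    "env": "malmo_mob_chase",   # assumed registered env name
    "framework": "torch",
    "num_workers": 0,
    "explore": False,
}

trainer = PPOTrainer(config=config)
trainer.restore("./results/PPO/checkpoint_000100/checkpoint-100")  # placeholder path
policy = trainer.get_policy()

# Roll out one episode on a (possibly different) mission, querying the policy directly.
env = malmoenv.make()
env.init(open("missions/mobchase_single_agent.xml").read(), 9000)  # as in the random-agent sketch

obs = env.reset()
done = False
while not done:
    # Besides the action itself, the extra fetches contain algorithm-specific
    # quantities such as the value-function estimate and action-distribution inputs.
    action, _state, extra = policy.compute_single_action(obs)
    obs, reward, done, info = env.step(action)
```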

Before we move on to the last tutorial, on multi-agent reinforcement learning, let's have a look at how the multi-agent setup works in Malmo. So far we have only had single-agent examples, where each Minecraft instance was attached to a single agent. In the multi-agent setup there are multiple roles, and for each environment there is an agent with role 0 that acts as the server, while all the other roles act as clients. Once we establish the mission, all the clients connect to the server and their observations are synchronized. To do that in code, we use an additional helper function to create a multi-agent environment, where we define the agent configuration; that's where we assign the roles to the agents. Then we use a turn-based RLlib environment wrapper, 'turn-based' here in the sense that the agents act at the same time step, so they are not acting asynchronously in their own time. Then, in a similar way to before, in the tune run function we now use the multi-agent argument, where we define the policies. In this example we use a shared policy, which means that both agents have identical weights but act in a decentralized manner, so they don't share information or observations with each other. To do that we also need to define the observation space, the action space, and a policy mapping function, but you can find more information on this in the RLlib documentation.

After you've trained the PPO agent in the multi-agent setup, you should see something like this: on the left side is the agent with role 0, which acts as the server, and the right-hand window is the client. In this mission the agents should collaborate and catch the chicken, or they can decide not to collaborate and just move to the sand tile, which one of the agents does. Sam, in the next section, is going to talk about ways you can make reinforcement learning agents learn how to collaborate.

To sum up what is in the Queen Mary Malmo repository: we added the launcher, which wasn't in the previous repository, and an updated pip package, so it's more convenient to call the launcher, for example; we also have updated documentation. We have the notebooks I mentioned previously, and we also provide plain Python scripts, so if you don't want to use the notebooks that's fine, you can use the scripts instead, which might be much easier to use in your own project. We also provide some additional helper functions, for example an observation wrapper that converts the Minecraft observations to any arbitrary size, the video recorder that produced the single-agent recording you saw earlier, and a symbolic representation extractor: so far all the examples were based on image input, and the symbolic representation extractor gives you a top-down symbolic view of the environment, which could be helpful for your project. Finally, we provide two PPO checkpoints, one for the single-agent Mob Chase mission and one for the two-player Mob Chase mission. Next, Sam is going to talk about how to learn to collaborate.

Thank you, Martin. I'm Sam Devlin, a senior researcher at Microsoft.

In this final section I want to talk a little bit about what happens when we try to apply single-agent reinforcement learning to multi-agent tasks such as the games we've talked about so far today, and I want to include some of our recent research that provides scalable approaches to learning coordinated policies in these complex games.

In all the work we've seen so far today, there are issues when directly applying single-agent reinforcement learning algorithms to multiplayer tasks; this is why the agents in Martin's section didn't coordinate. In particular, the challenge is that if we just naively place multiple individual learners into the same environment, the environment appears non-stationary to any one of them. That is to say, the same observation and action map to different outcomes, because the other agents are also updating their policies at the same time in the environment. This breaks fundamental assumptions in how single-agent reinforcement learning algorithms are designed, and in particular for deep reinforcement learning, where it's common to use a replay buffer, it can cause issues where those experiences become stale, because they depend on the previous policies of the other agents in the environment.

An alternative approach is to just group all agents as one: if we're trying to learn a joint policy for a group of agents that we're controlling, then why not just stick them all into one big agent that controls all of them? This can be done in certain use cases, but it leads to an exponential increase in the state-action space, which then requires even more training time; and as deep reinforcement learning algorithms are typically very sample intensive, this exponential increase in the state-action space can be very problematic, making it intractable for many people to train agents in this way. From our perspective, looking at gaming applications, even worse than this, it can be considered cheating: a lot of these games are designed so that you have partial observability based on where you currently are in the space, so if we allow all the agents in the game to see what everyone else is doing, then that's not how the game is meant to be played.

Then that’s not how the game is played one way to work around these two issues is to use the paradigm of centralized training for decentralized execution so in this approach the agents are considered as one whilst training so we use all of the data that can be generated from all agents

During training and then we learn policies that can be decoupled at the time that we deploy them so this framework is used a lot in many modern deep multi-agent reinforcement learning approaches it uses the same assumption as many recent distributed reinforcement learning algorithms such as impala and it’s the simplest but most

Effective way to to quickly learn a coordinated policy in particular this gets used to learn often to learn a centralized critic and there’s an example implementation in the rl lib docs for how to do this with the setup that we’ve provided and talked about in the earlier sections
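
As a purely illustrative sketch of the idea (not the RLlib example implementation), the key structural point is that the critic used during training may consume the joint observation of every agent, while each policy used at execution time only ever sees its own local observation:

```python
import numpy as np

def centralized_value(joint_obs, critic_weights):
    """Training only: the critic scores the concatenated observations of ALL agents."""
    return float(critic_weights @ np.concatenate(joint_obs))

def decentralized_action(own_obs, policy_weights):
    """Execution time: each agent's policy acts only on its OWN observation."""
    logits = policy_weights @ own_obs                 # shape: (n_actions,)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```

Because the critic is only used to compute the learning signal, it can be discarded at deployment and the per-agent policies can run independently.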

In any project where I'm trying to learn coordinated policies for multiple agents, this approach with a centralized critic would always be my first go-to as a safe bet for learning a reasonable enough policy.

As we dive deeper into some of the problems that come up, I want to take a look at this particular instance of one of the games we talked about earlier. In this situation, one of the agents has trapped the mob in the far corner, and the other one is not really doing anything; it's just standing in the corner looking at what's going on. In a team game where both agents are rewarded the same based on how they perform as a team, both agents here would receive a positive reward, because one of them has captured the mob, but the other agent, who is not really helping, will also get a positive reward, and so may learn that standing in the corner doing nothing is a useful behavior. This is known as the multi-agent credit assignment problem: how do we break down a global reward so that each agent understands what its contribution is, and learns policies that actually contribute towards the team?

One approach to tackle this is difference rewards. This was originally proposed by David Wolpert and Kagan Tumer in a 1999 NASA tech report under the name of the Wonderful Life Utility. Formally, instead of receiving just the shared team reward that the game gives, each agent receives a shaped reward, which is the difference between what the global team reward was and what it would have been if that agent didn't exist or had followed some sort of default policy.
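
In symbols (following the Wolpert and Tumer formulation, with my own notation): if $G(z)$ is the global team reward for the joint state-action $z$, and $z_{-i}$ is $z$ with agent $i$'s contribution removed or replaced by a default "clamped" action $c_i$, then agent $i$'s difference reward is:

```latex
D_i(z) \;=\; G(z) \;-\; G(z_{-i} \cup c_i)
```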

By doing so, we reward the agents based on their actual contribution to the game rather than on how the team is doing overall; if each agent tries to maximize its contribution, then typically the team will perform better. This is a multi-agent-specific form of reward shaping, designed to remove the noise created by the actions of the other agents in the environment, so each agent gets a clear signal about what it is actually contributing to the game.

This approach was originally proposed in 1999, and there was over a decade of very successful applications of it to collaborative games. However, it made one fundamental assumption: that you had direct access to the reward function, so that you could calculate that far right-hand term, what the global reward would have been if you weren't there or if you'd taken a default action. This isn't always possible, and so in a 2014 paper by Mitchell Colby, Kagan Tumer, and colleagues, they proposed instead that you learn the reward function, so that you can then query it for this second term, to find out what the reward might have been if you weren't in the environment. This line of work was then built on further by Jakob Foerster and colleagues in their AAAI 2018 paper on COMA, which extended it into a deep reinforcement learning approach where they instead use the value function of a centralized critic. So again they're using this paradigm of centralized training, decentralized execution: they learn a centralized critic and then calculate the difference rewards from the central value function instead.

This allowed them to scale up to some tasks within StarCraft and other complex games, using the flexible abilities of deep nets for value function approximation. In a more recent, upcoming paper that I was involved in, we have proposed Dr.Reinforce, another deep reinforcement learning approach that instead returns to the concept of learning the reward function. This is because learning a centralized Q-function can be prohibitively challenging, so if we can learn just the reward function, rather than the more forward-looking Q-function, we might get a more efficient way of estimating the difference rewards while still gaining the benefits that COMA had from using deep reinforcement learning to learn the policy. Let's have a look at how this performs in practice.

In this paper we considered a smaller version of our Mob Chase game, where we have three agents trying to capture a mob, or prey, pictured as the red square. By experimenting in this simpler environment we were able to compare against earlier methods that had direct access to the reward function, that past decade of work I alluded to earlier.

When we have three agents like this, we can see that all methods perform relatively well. The line to look at in particular is the green line in the top left corner: the best-performing agent learns very quickly to achieve the highest performance. This agent is the one that uses the previous assumption that you have access to the reward function, so it can directly calculate the difference reward rather than approximate it, whereas our agent, the dark blue line, which has to learn the reward function online whilst the agents are acting, is able to recover the same performance fairly quickly after that agent; overall, all the agents behave fairly similarly. We see that COMA is also competitive here, as is the Colby agent from the original 2014 paper. One of the big things I would take away from this is that any time you're using difference rewards, they still outperform the naive approach of just placing multiple individual learners in the environment.

However, as we increase the number of agents in this environment, we see that the effect size becomes larger and the methods start to separate. In this case we see that our approach, Dr.Reinforce, which learns the reward function while still using a deep reinforcement learning approach to learn the policy, is able to handle more agents in the environment. We believe this is down to the difficulty of learning the centralized Q-function with COMA, which is again evidenced in the final plot on the far right, where COMA is now performing worse than the original approach by Colby, despite having a more powerful function approximator for the policy; Colby's approach is more limited, but it is again learning the reward function and not the joint Q-value. We're still exploring the cause of all of these differences and how this approach scales to more complex environments, but this is one approach that has been tested in a wide variety of scenarios for learning more coordinated policies and overcoming that challenge of multi-agent credit assignment.

I want to move on now to a slightly different problem, something we saw when we originally proposed these tasks as a competition in 2018. In that competition we had a number of participants submit agents that perform very well when playing the games with another instance of their own agent. So if we have the two agents in the left box, each performs very well with another instance of itself, and similarly with the ones on the right. However, if we take one agent from each of these teams, so these are two agents trained by different teams but both performing well when teamed with another copy of themselves, they don't necessarily perform well together. There can be miscoordination due to assumptions each agent makes about the other agent in its environment. Even more bizarre, we found that if you train two instances of the same agent, using exactly the same code base but just a different random seed, they can often be uncoordinated and not play the game well together.

This is known as the ad hoc teamwork problem: what we want is agents that are able to play with any other agent without any prior coordination. The agents in this competition so far have strictly relied on the fact that they have trained with the other agent and have formed concepts of how they should take different roles within the game, but ultimately we want an agent that can recognize who it's playing with and adapt to them online, so it plays well with them. Formally, this extends the normal multi-agent MDP objective, which tries to maximize the accumulated reward of all agents in the environment; the extension just adds an expectation over the other possible agents, so we want our agent to perform well on average across all the agents it might play with in the environment.
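
Sketching that extension in symbols (my own notation, not necessarily that of the paper): rather than maximizing the expected discounted return against a fixed, co-trained partner, the agent's policy $\pi_i$ is chosen to maximize the return in expectation over a distribution $P$ of possible partner policies $\pi_{-i}$:

```latex
\max_{\pi_i} \;\; \mathbb{E}_{\pi_{-i} \sim P} \left[ \, \mathbb{E}_{\pi_i,\, \pi_{-i}} \left[ \sum_{t=0}^{T} \gamma^{t} r_t \right] \right]
```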

This was particularly well summed up in a challenge paper from AAAI 2010 by Peter Stone and colleagues, where they talk about the human ability for ad hoc teamwork, doing things like playing pickup games of basketball. As humans, we should be able to play a game we're good at with anybody we meet, maybe with a brief period of coordination at the beginning, but we shouldn't completely fall apart and be unable to adapt to them. That's ultimately what we want for the agents, and what we're trying to achieve when taking on the ad hoc teamwork challenge.

To go after this challenge, we recently proposed a method that's upcoming at the AAMAS conference this year. This looks like a nightmarish bit of a network, but I can break it down into four simple stages, where this represents the network that is our policy for this agent.

First, we observe the behavior of the other agents in our environment. This can be done online while the agent is learning, as in this paper, or from a prior batch of data, for instance replays of human players playing the game on public servers. Our network architecture includes an information bottleneck through which these observations must be compressed, and from this compressed representation we then try to predict the future actions of the other agents. The error in our predictions can be used to update the parameters of the blue encoder, to maximize the information retained from the observations that is needed to predict the future actions. We factor these predictions per agent, which allows us to scale well in the number of agents, but you could also do a joint prediction here for the actions of the other agents. From that training of the top two sections, we get this compressed representation that attempts to capture our belief over the other agents.

We did this in a way such that there are two separate versions here: one that stays stable throughout the entire episode, which is our intention, trying to capture the play style of the other agent, and another that changes per time step, which is hoping to capture the current mindset. So you might be playing with someone who plays in a particular style, but recognize that they've maybe gone into a particular mode within that style. We do this with a variational autoencoder-like structure, so we capture both a current mean estimate and some variance over it, to try to capture some uncertainty over our current belief of the other agents' play styles and mindsets.

In the final stage, we can then condition our policy on that current belief over the other agents: instead of just choosing an action based on the current state, we choose our action based on both the current state and our current belief about the other agents in our environment. If those beliefs are currently highly uncertain, our agent may learn to perform information-gathering actions to infer more about the other agents it's acting with, or, if it's at a critical stage of the game, it may choose to act despite this uncertainty. For instance, in the Mob Chase scenario, if the mob were about to escape, the agent might choose to capture it even if it doesn't know that the other agent will be there to support it. Either way, the agent now adapts to the others in its environment instead of assuming that they will adapt to accommodate it. Finally, to demonstrate this approach in practice, I'll show it on another small game that we used in the paper.

In this game we have two agents that are collecting coins that they want to take to the bank: red coins must be taken to the red bank, blue coins to the blue bank. The team is rewarded collectively, so this is a fully collaborative game where both agents want to maximize the number of coins they can take to the bank in a fixed period of time. We are controlling the agent with the thought bubble, but we have no control over the agent in the bottom left corner. That agent might have a preference for particular types of coins, might prefer to take the coin that's always closest to it, or might always try to take the coin that's furthest away from it. Depending on how that other agent is acting, our agent needs to recognize it, adapt, and play the game differently.

What we see when we apply our method, here the Bayesian meta-learning approach, is a far higher average return than some comparative methods from the literature. First we have the dashed gray line, which is a typical model-free approach with a feed-forward network for the policy, so this agent takes no consideration of the other agents in the environment; it has just learned, on average, how best to respond to all of the agents it has seen during training. Alternatively, our approach can be seen as a meta-learning approach, where it has learned over a population of other agents that it trained with, so we compared against the RL² algorithm, the green line, which is a state-of-the-art meta-learning approach with a recurrent network. As we can see, our approach significantly outperforms these two baseline approaches. If we look a little deeper into this, we can also see a potential cause for why this is happening.

In this next plot, what we're trying to do is predict which other agent is in the environment. This wasn't part of the training loop for either agent; what we do is, once the agents are trained, or at various time steps throughout training, we take the current intermediate representation and attempt to predict from it which other agent it is. This is a separate supervised learning problem, used just to probe what is being learned in the intermediate representations of these agents. What we see is that, from the intermediate layers of the RL² agent, it remains quite challenging throughout training to predict what the other agent is doing, so this agent is not retaining information about what the other agent in the environment is doing, whereas with our Bayesian meta-learning approach that intermediate representation can, after training, be used to accurately predict which other agent it is, showing that it has retained information critical to understanding who it is playing with.

To close, I just want to summarize some of the core problems that occur in multi-agent reinforcement learning and some of the methods that we've talked about today to overcome them.

First, if we just naively place multiple single-agent reinforcement learning algorithms into the same environment, we introduce the problem of non-stationarity, which breaks many standard assumptions in RL algorithms, meaning they won't necessarily converge towards an optimal policy. Secondly, we have the curse of dimensionality: if we put all of these agents into one monolithic agent that has to control all of them, we get exponential growth in the state-action space, which can be intractable to learn. Then we talked about the multi-agent credit assignment problem, where it's hard for an agent to understand, from a global team reward, what it did to contribute towards that team score; here we talked about difference rewards as an approach to give more informed credit to the agents that actually contributed to the success of the team. And finally we talked about ad hoc teamwork, the problem of an agent having to generalize to another agent it hasn't previously coordinated with: how do you learn an agent's policy that can generalize to a wide range of other partner agents it might play with? We talked a little bit about some approaches to this, but for interested listeners I would also recommend these two surveys, which cover a wide range of past methods, both from the pre-deep-reinforcement-learning era in the first survey from 2008, and in a very recent survey that covers a large portion of the approaches that have been proposed more recently in the deep reinforcement learning paradigm.

The approaches I've covered today have not yet been tested in the multi-agent Minecraft tasks my collaborators presented earlier, so I invite all of those on the call today to try these approaches out, and I would love to hear the learnings of anybody in the audience who gains insights by applying these approaches to those multi-agent problems in Minecraft. Finally, before we start the Q&A session, I would just like to invite you all to join us for our upcoming AI and Gaming Research Summit, which will take place on February 23rd and 24th, 2021. There's a link there for registration and for any of the resources that we make available after the event. This would be a great opportunity to learn about a far wider range of research in this area, looking both at AI agents in other settings and at topics such as responsible gaming, computational creativity, and understanding players. Thank you.

Hello everyone, many thanks to all of you for attending the reinforcement learning in Minecraft webinar today, and also for staying for this live Q&A. I'm Diego Perez, one of the presenters of this webinar, and with me are Raluca Gaina, Martin Balla, and Sam Devlin, who will also be answering your questions. You have probably seen that we've already answered some of your questions in the chat, and now we're going to take some extra questions that you submitted, which we've selected to reply to live. So let's get started.

The first one is a question submitted by Lucas from the University of Edinburgh, and I'll read the question now: did you observe, for the meta-learning work, that the variance and uncertainty of the VAE was interpretable? Could you, for example, observe that the variance increased whenever the other agents executed behavior rarely or never seen during training, and hence that inferring the play styles could be difficult? I think Sam is going to answer this. Sure, thanks for the question, Lucas.

In the example that we showed in the talk, it was a little harder for us to interpret what was in the latent representation learned by the VAE, due to the size we chose to encode it as. However, in the paper that supports this work, we did look at a game where we were able to use a far smaller latent representation, and there it was a very interpretable representation that was learned: we see things like it counting the actions that the other agents were taking, and clearly separating the different types of agents that it trained against. In all of those cases, what we did see was that the variance would reduce over time when it was recognizing a behavior, so once it had seen enough of the history it had more confidence about which agent it was playing with. But what we didn't explore was how to generalize beyond agents that are in that training set, so I think that's the really exciting part of Lucas's question, about having the agent actually recognize when it's playing with someone it doesn't recognize. I think that's a really key thing for future work: these agents need to be able to learn to acknowledge that they're in a position where they don't understand what's going on, and so take more information-gathering actions. I know Lucas and the team at Edinburgh are also working on this, so I'm really keen to see what they do in that space; maybe they'll have something to share with us all soon.

All right, thank you. This next question was submitted from QMUL, and there are actually quite a few questions in one, so I'm going to split it into two parts. Starting with the first one: is the design of the structure of the deep neural network important in RL? Do you use a general neural network for different games, and are the hyperparameters of the DNN tuned for every game? Yeah, I can jump on that.

So, another good question. Obviously, ideally we wouldn't want to have to tune these specifically for every single game, but in practice that's still the way we're mostly doing things. There's a very interesting line of work on making deep RL more generalizable and more robust, and this is something we've explored; I posted a link in the questions to our NeurIPS 2019 paper on this topic. So we're definitely exploring options for network architectures that allow more generalization, and we've particularly found that the variational information bottleneck is beneficial for doing so, but it's still a wide-open question, particularly for that challenge of generalizing across different games.

We're also exploring this from another perspective: working with colleagues at MSR New York from a more cognitive neuroscience background, we're looking at more neuro-inspired architectures that can encode more human-like priors, to get more human-like behavior out of our agents, which we hope would then be broadly applicable to a whole range of games.

And as a follow-up: if the input is an image, do you need state-of-the-art image processing, or can a simple model be used? This is somewhere our team differs a little bit. We work very closely with game studios, and in this instance, with the way Malmo is set up, we can get access to more low-level game state from the game itself. In many of our games we don't see an advantage to operating at that per-pixel level, so a lot of this work is done off of a lower-level game state, taking advantage of the fact that we have access to the game, so you can learn in a much more sample-efficient way than doing things at the pixel level.

Okay, fantastic. We've got another question, from Ashwin from the University of Buffalo: considering that this is a stochastic environment, how do the agents act based on the probability, and is the choice of states random or based on a certain transition probability?

I think Sam can also reply to that. Sure. For the majority of the work that we do, we use policy gradient or actor-critic based algorithms so that we can learn stochastic policies; particularly in multi-agent settings this is very important. I don't think I quite follow the second half of this, but the choice of states will very much be based off of the transition probability within the environment: it is probabilistic, but these algorithms are able to deal with that. The part that we do sometimes randomize, though, is generating lots of different instances, so that the agents get trained in different settings. This is both for making the agents more robust, a method popularly known as domain randomization, and also something we did in these environments when we ran them as competitions, to ensure that we could provide test-set environments that the participants hadn't previously trained on. All right, I hope that covers it.

Okay, there's another question, by Hamsa (sorry if I'm pronouncing the names incorrectly) from Ryerson University: I'm a student and I don't know too much about employing reinforcement learning in games; how would you recommend I get started in this field?

I believe Martin Balla can answer this. There you go. Yeah, so this is a good question. I would recommend starting by learning a bit about the theory of reinforcement learning, and there are some good lectures on YouTube that you can follow. I also recommend picking up Python programming if you don't know it already, and then OpenAI Gym has very good examples to get started with. Okay, thank you.

Let's see, another question, from Qingbiao Li from the University of Cambridge: recently, graph neural networks have also been actively investigated for decentralized multi-agent problems; related works show that this can be helpful for generalization performance, for example training with ten agents but testing at a larger scale, as also addressed in your work. Have you considered applications of GNNs in your team? I don't know, Sam, if you can answer this. Sure. Yeah, it's not something we've experimented with yet, but it is a very exciting direction. Again, I know that Qingbiao and the group at Cambridge are doing some really exciting work in this space, so we'd love to see if that can scale to these tasks in Malmo. This is really why we put these environments out there: to see how the community can engage with these problems and try out these different approaches. There's only so much bandwidth within one team to try out new ideas, but graph neural networks are definitely a very powerful tool for approaching these. Okay, thank you. We're going to leave a couple of minutes in case somebody else has another question.

So I did have one that I didn't quite get around to responding to in my batch: Sagar Ubretti at the University of Warwick asked how we define 'mindset' in the paper on ad hoc teamwork. Here we weren't explicitly defining them; essentially we have two variables in the latent representation, one that's only sampled once per game and one that's sampled per time step, and so really the labeling of one as a mindset is more just a way for us to interpret and communicate our intention with those two different components. It's actually something that was originally defined in the Machine Theory of Mind paper from a few years back; I can't remember the reference off the top of my head, but it's cited in the paper, so you can see where that one came from. But yeah, the mindset, I believe, was the per-time-step one, and we capture the overall play style with the per-episode one: you have a particular play style that you adhere to throughout when you play, whereas the mindset is more that sort of moment-to-moment reaction, and how these might change based on what's currently going on.

That’s great thank you uh this is another question uh by uc liu from qmul uh he says hello in terms of learning all agency monkeys and platform uh how to learn the other agent efficiently uh given the data for the region is is too limited in addition come the method of learning

Other agents being implemented into the national game to learn about the point again um well actually multi-agent learning is something that has been has been researched for for many many years that is one of the main research questions that is at the moment and normally what the people

Try to do uh the most said to to implement or create a model of the of the opponent by basically trying to analyze uh the title and the policy that this opponent is going to be executing uh by basically starting data for playing uh often i guess i guess with these

Games versus this uh this agents and trying to build this this model uh in a in a more or less dynamic manner um i i don’t know if if anyone else wants to add something else to this or any particular ways in which maybe microsoft is doing this

Yeah, so I think the example I presented at the end is probably the best one I have in mind for learning that model. In a competitive game, it's absolutely the case that if you learn the model, then that's a way to exploit a particular opponent, beyond just playing a Nash equilibrium that's safe against and robust to a whole range of opponents. In the paper that's linked for the ad hoc teamwork setting, we do have a game that is more competitive than the example I showed, so it is equally applicable in that setting, but I'm really motivated by the application of using it for fully collaborative games. Ideally, what we want is for these agents to empower the player and enable the player to do more, rather than exploit them or try to beat them. It's quite easy to make agents that are very good at games and beat people; it's much more fun to create agents that allow more people to enjoy games.

Okay, there are no more questions coming up; we can wait a minute just in case somebody's typing. It's probably a good chance just to make sure everybody's seen the invite in the resource list to our AI and Gaming Research Summit next week. Registration is open until tomorrow, so please do sign up today if you're interested. There's going to be a lot of work there spanning a range of different topics in game AI, including lots on collaborative agents collaborating with humans, but also going into things such as computational creativity, responsible gaming, and understanding players, so hopefully it's of wide interest to people on this call today.

Okay, so I think we might be ready to start wrapping up. Thank you for attending today; we appreciate your participation, your questions, and your interest in the subject. This tutorial is going to be available on demand very shortly after this Q&A finishes.

If you are interested in learning more, as Sam just said, we have a list of great resources in the resource list to the right of your screen. As you can see there, we have the presentation slides we've been using during the webinar, a couple of links to Project Malmo (the actual research project at Microsoft Research) and to the latest repository on GitHub, where you can find the final examples and the final code that has been implemented. There are also a couple of references to the relevant papers we have mentioned today during the webinar, and, as Sam also just mentioned, the last one is a link to the AI and Gaming Research Summit that's happening next week, the 23rd to the 24th of February. Registration is free, but it's going to close tomorrow, so try to register by tomorrow at the latest. It's going to be a very interesting event, with lots of different topics being covered over two full days of work and talks on RL and game AI and so on, so do check that out. We look forward to seeing how you build on all of this research and how you push the boundaries of RL and game AI, and we're looking forward to seeing your work. So thank you again for tuning in, and have a great day.

This video, titled ‘Reinforcement learning in Minecraft: Challenges and opportunities in multiplayer games’, was uploaded by Microsoft Research on 2021-03-17 20:22:48. It has garnered 3468 views and 81 likes. The duration of the video is 01:07:25 or 4045 seconds.

Webinar starts here: https://youtu.be/bhHWmwSixJw?t=59

Games have a long history as test beds in pushing AI research forward. From early works on chess and Go to more recent advances on modern video games, researchers have used games as complex decision-making benchmarks. Learning in multi-agent settings is one of the fundamental problems in AI research, posing unique challenges for agents that learn independently, such as coordinating with other learning agents or adapting rapidly online to agents they haven’t previously learned with.

In this webinar, join Microsoft researcher Sam Devlin and Queen Mary University of London researchers Martin Balla, Raluca D. Gaina, and Diego Perez-Liebana to learn how the latest AI techniques can be applied to multiplayer games in the challenging and diverse 3D environment of Minecraft. The researchers will demonstrate how Project Malmo—a platform for AI experimentation built on Minecraft—provides an ideal environment for designing different and rich training tasks and how reinforcement learning agents can be trained in these scenarios. They’ll provide examples of tasks, agent implementations, and the latest research done in this area.

Together, you’ll explore:

■ The Malmo platform and multi-agent tasks
■ Using the reinforcement learning library RLlib to implement and train agents to complete Minecraft tasks (a minimal training sketch follows below)
■ Coordinated policies for collaborative multi-agent tasks
■ Open challenges in learning robust policies for ad-hoc teamwork
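As a minimal sketch of the second item above (training an agent with RLlib), the snippet below shows the general shape of an RLlib PPO training loop using the Ray 1.x style API that was current around the time of this webinar. The `MalmoEnv` wrapper and the `malmo_env` module are hypothetical placeholders for a Gym-compatible wrapper around a Project Malmo mission; the actual tasks and environment code are in the GitHub repository linked in the resource list below.

```python
# Minimal RLlib PPO training sketch (Ray 1.x style API).
# "malmo_env.MalmoEnv" is a hypothetical placeholder for a Gym-compatible
# wrapper around a Project Malmo mission.
import ray
from ray.rllib.agents.ppo import PPOTrainer
from ray.tune.registry import register_env


def env_creator(env_config):
    from malmo_env import MalmoEnv  # hypothetical module
    return MalmoEnv(env_config)


register_env("malmo_task", env_creator)

ray.init()
trainer = PPOTrainer(
    env="malmo_task",
    config={
        "num_workers": 2,        # parallel rollout workers
        "framework": "torch",    # or "tf"
        "train_batch_size": 4000,
    },
)

for i in range(100):             # a handful of training iterations
    result = trainer.train()
    print(i, result["episode_reward_mean"])
```

For the collaborative multi-agent tasks mentioned in the third item, RLlib's `multiagent` config section (a policy dictionary plus a `policy_mapping_fn`) is the usual way to share or separate policies between agents, but the details depend on the task definitions in the repository.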

𝗥𝗲𝘀𝗼𝘂𝗿𝗰𝗲 𝗹𝗶𝘀𝘁:

■ Project Malmo – Microsoft Research (project page): https://www.microsoft.com/en-us/research/project/project-malmo/
■ Project Malmo key repository (GitHub): https://github.com/GAIGResearch/malmo
■ Difference Rewards Policy Gradients (paper): https://www.microsoft.com/en-us/research/publication/dr-reinforce/
■ Deep Interactive Bayesian Reinforcement Learning via Meta-Learning (paper): https://www.microsoft.com/en-us/research/publication/deep-interactive-bayesian-reinforcement-learning-via-meta-learning/

*This on-demand webinar features a previously recorded Q&A session and open captioning.

Explore more Microsoft Research webinars: https://aka.ms/msrwebinars
