
There is a joke that cracks me up:
"Did you know that, before the clock was invented, people had to actively roam around and ask people the time?"
There is obviously no need to explain the joke, but if we overthink it a little bit (like good mathematicians do), we can say that the joke is about the fact that the information held by one particle of a group can be used to inform all the other particles. This concept is actually way deeper than the joke suggests, and it can be exploited further.
Let’s consider a self-organized system, such as bird flocking or fish schooling. We can define this system as one made of particles (e.g. a particle is a bird). We can also assume, with a good degree of approximation, that these particles move around in space, adjusting their positions based on two factors:
- The best position that the specific particle knows: what the bird thinks is best for itself.
- The global best position, given by all the particles "communicating" with each other: what the bird is instructed to do by the "main bird".
Now, what is "best" in Nature? What is the best thing for the bird to do? What is best for the "swarm"? I’m absolutely not the right person to ask, as I really have no idea. What I DO know is that, by observing this kind of behavior in Nature, we are able to formalize a very fascinating optimization algorithm. In other words, if we do define what is best, then we can use this evolutionary approach to optimize the function we selected.
This algorithm is known as Particle Swarm Optimization (PSO). I know, this is a pretty big leap. What is "optimization"? Why are we talking about math all of a sudden? What is it that we are optimizing? Throughout this article, I will try to cover all these steps and, more importantly, we will use object-oriented programming in Python to create our own ParticleSwarmOptimizer() class. In other words, we will cover the PSO world from theory to practice.
Let’s get started! 🐦🐦🐦
0. Introducing "Optimization"
I have the feeling that if you are reading about "Particle Swarm Optimization", you probably already know a little bit about "optimization" and just don’t know about the "particle swarm" part, so I don’t want to spend too much time on this and bore you.
What I do want to do is give the problem a little bit of formalization. As we said earlier: mathematically, what is the "best thing to do" for a swarm?
Well, mathematically, the "birds of a swarm" (or "particles") are points in a k-dimensional space. Our "domain" can be the 3D space we live in, but it can indeed be much larger than that. Actually, this method gets interesting when we forget about the "bird" example and realize that we can use it, in general, to find the minimum or maximum of a given function.
For example, let’s say that the price of a house is automatically generated exclusively by an algorithm (veeery dangerous not to keep a person in the loop, but anyway):

Now, this doesn’t make much sense, right? What is the "house" icon supposed to be? In practice, we have a list of features, for example, location, size, etc…

Now this makes sense: the house is translated to the feature space, so each house is a k-dimensional point. The "process" that converts the feature space into a house price is a "black box" function:

$$L : \mathbb{R}^k \rightarrow \mathbb{R}, \qquad \text{price} = L(x)$$
The real game here is this:
"What is the x that I need to have to get the minimum cost?"
This is our optimization problem:

$$x_{\min} = \arg\min_{x} L(x)$$
The end game of PSO will be the coordinates of x_min. Let’s see how we deal with it.
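Before that, just to make the setup tangible, here is a toy version of such a black-box function; the features, coefficients, and functional form are completely made up for illustration:

```python
def house_price(x):
    """A made-up 'black box' pricing function, for illustration only.
    x[0]: size in square meters, x[1]: distance from downtown in km."""
    size, distance = x
    return 3000.0 * size + 150000.0 / (1.0 + distance)
```

PSO’s job will then be to find the coordinates x_min that minimize this function within the allowed bounds.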
1. Introduction to Particle Swarm Optimization
Before we start, I want to say that it’s extremely helpful to look at the original paper. I would say that it’s one of the least-technical technical papers I have ever read. It is a very easy read and, at the same time, it is very informative and fun.
Alright, so, what is the idea behind this? As we said, the idea is that we are creating a set of particles that collaborate to find the global minimum by exchanging the information that they have. If you want, it is very similar to what happens in shows like "Stranger Things", where there is an enemy to beat and all the characters work together, talking with each other using all the means that they have.
More technically, we start with num_particles particles. These particles have random locations, and at each of these locations the cost function L takes a value. For example, particle 1 will have location x_1 and cost L(x_1). This is iteration 0, and "after" this first iteration we have built our system. Now the system needs to evolve. How do we do this? With a quantity called "velocity". We say that, for a particle x, for each dimension i and each iteration j, we have:

$$x_i^{j+1} = x_i^{j} + v_i^{j+1}$$
Or, in the more elegant vector form:

$$x^{j+1} = x^{j} + v^{j+1}$$
Now, how do we define v? That is where the beauty of this method comes in. The vector v has basically three components:

$$v = v_{\text{inertial}} + v_{\text{cognitive}} + v_{\text{social}}$$
Let’s explore them:
- v_inertial is basically the memory of the previous velocity. The term comes from physics, where an inertial frame of reference is one in which a body remains at rest or moves at a constant velocity:

$$v_{\text{inertial}} = k_{\text{inertial}} \, v^{j}$$

where k_inertial is a constant.
- v_cognitive is the velocity suggested by the particle itself (which is, in that sense, "cognitive"). It pushes the particle in the direction of the best position that particle has found so far. We also add a little bit of randomness through a random number r_1:

$$v_{\text{cognitive}} = k_{\text{cognitive}} \, r_1 \, (x_{\text{best}} - x)$$

k_cognitive is again a constant, and x_best is the best (lowest-cost) position found by particle x.
- v_social gathers the information from all the other particles: we move in the direction that is best for the swarm as a whole (a one-step sketch in code follows below):

$$v_{\text{social}} = k_{\text{social}} \, r_2 \, (x_{\text{global best}} - x)$$

k_social is a constant, r_2 is a random value, and x_global_best is the best (lowest-cost) position found across all the particles.
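Putting the three components together, a single update step looks like this in Python/NumPy. This is a minimal sketch, and the default k values are my own illustrative choices:

```python
import numpy as np

def update_step(x, v, x_best, x_global_best,
                k_inertial=0.8, k_cognitive=2.0, k_social=2.0):
    """One PSO update for a single particle, in vector form."""
    r1, r2 = np.random.rand(2)  # the randomness mentioned above
    v_new = (k_inertial * v                             # inertial component
             + k_cognitive * r1 * (x_best - x)          # cognitive component
             + k_social * r2 * (x_global_best - x))     # social component
    return x + v_new, v_new  # new position, new velocity
```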
Now, that is, I think, extremely elegant. Let’s wrap it up:
- We start by selecting a certain number of random particles (i.e. particles in random locations)
- We move these particles based on the velocity parameters
- These velocities are given by three factors: an inertia, which keeps the memory of the previous velocities, a cognitive velocity, which keeps us moving in the direction that is the best for the particle, and a social velocity, which keeps us moving in the direction that is suggested by the whole group of particles.
- We keep changing the location of these particles for every iteration until we get to num_iteration. Then we select our best option.
Ok, enough words for today. Let’s run some Python 🙂
2. PSO Implementation
We will use two classes to build our PSO. The first is the class for a single particle; the second takes the single-particle class as a building block and constructs our swarm.
The first thing I want you to build is a constants.py file. This one keeps all the default info, like the boundaries of the particles, the number of iterations, and all the k values. How do you change these? You create a .json file named "_config/pso_config.json_". If you don’t have one, the script I will provide in the classes will build it for you.
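As a rough sketch of what constants.py could do (the keys and default values here are my assumptions, not the original file):

```python
# constants.py -- minimal sketch: default settings plus JSON config handling.
# Keys and default values are illustrative assumptions.
import json
import os

CONFIG_PATH = os.path.join("config", "pso_config.json")

DEFAULTS = {
    "num_particles": 30,
    "num_iterations": 100,
    "bounds": [-10.0, 10.0],   # search-space boundaries
    "k_inertial": 0.8,
    "k_cognitive": 2.0,
    "k_social": 2.0,
}

def load_config(path=CONFIG_PATH):
    """Load the JSON config, creating it with the defaults if it is missing."""
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            json.dump(DEFAULTS, f, indent=2)
    with open(path) as f:
        return json.load(f)
```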
Now, everything we said is implemented inside a .py file that I called particle_swarm_optimizer.py.
Let me describe it to you in a few words:
- Particle() builds a single particle. It starts from a random location and has the update_velocity() and update_position() methods to update the velocities and positions based on what we said above (you can check to make sure I’m not lying). A minimal sketch of this class follows right after this list.
- ParticleSwarmOptimizer() takes two objects: the Particle() class, used as a builder (you can make the algorithm more complex if you want, by adding features to the particle or modifying the update methods), and objective_function, the black-box function that we want to minimize.
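Here is a minimal sketch of what Particle() could look like, following the update equations above (the attribute names and default constants are my assumptions):

```python
import numpy as np

class Particle:
    """A particle: a random position, a velocity, and a memory of the
    best (lowest-cost) position it has personally visited."""

    def __init__(self, num_dimensions, bounds):
        low, high = bounds
        self.position = np.random.uniform(low, high, num_dimensions)
        self.velocity = np.zeros(num_dimensions)
        self.best_position = self.position.copy()
        self.best_cost = np.inf

    def update_velocity(self, global_best_position,
                        k_inertial=0.8, k_cognitive=2.0, k_social=2.0):
        """Inertial + cognitive + social components, as in the equations."""
        r1, r2 = np.random.rand(2)
        self.velocity = (k_inertial * self.velocity
                         + k_cognitive * r1 * (self.best_position - self.position)
                         + k_social * r2 * (global_best_position - self.position))

    def update_position(self, bounds):
        """Move the particle, keeping it inside the search boundaries."""
        self.position = np.clip(self.position + self.velocity, *bounds)
```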
Now, the star of the show is, of course, ParticleSwarmOptimizer(). With optimize() we run the optimization process. We can select whether we want to print the results (iteration by iteration) and whether we want the solution array (again, iteration by iteration). plot_pso_convergence() will help us in the plotting stage. If you set animated=True, you will get a .GIF of the convergence, meaning that you will see the estimate of the global minimum move, iteration by iteration (very cool stuff, you’ll see it in a second).
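And here is a stripped-down skeleton of that optimizer; plotting is omitted and the exact signatures are my assumptions, but the loop is the algorithm described above:

```python
import numpy as np

class ParticleSwarmOptimizer:
    """Minimal sketch: evolve a swarm of particles to minimize a
    black-box objective_function."""

    def __init__(self, particle_class, objective_function,
                 num_particles=30, num_dimensions=2, bounds=(-10.0, 10.0)):
        self.objective_function = objective_function
        self.bounds = bounds
        self.particles = [particle_class(num_dimensions, bounds)
                          for _ in range(num_particles)]
        self.global_best_position = None
        self.global_best_cost = np.inf

    def optimize(self, num_iterations=100, verbose=False):
        for iteration in range(num_iterations):
            # Evaluate every particle and update personal / global bests.
            for p in self.particles:
                cost = self.objective_function(p.position)
                if cost < p.best_cost:
                    p.best_cost, p.best_position = cost, p.position.copy()
                if cost < self.global_best_cost:
                    self.global_best_cost = cost
                    self.global_best_position = p.position.copy()
            # Move the swarm.
            for p in self.particles:
                p.update_velocity(self.global_best_position)
                p.update_position(self.bounds)
            if verbose:
                print(f"iteration {iteration}: best cost {self.global_best_cost:.6f}")
        return self.global_best_position, self.global_best_cost
```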
3. Hands-On Examples
We did all the dirty work before; now we can run everything in very few lines. If we select a very simple quadratic function:
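With the classes sketched above, a minimal run could look like this:

```python
import numpy as np

def quadratic(x):
    """A simple convex objective: a paraboloid with its minimum at the origin."""
    return np.sum(x ** 2)

pso = ParticleSwarmOptimizer(Particle, quadratic)
best_position, best_cost = pso.optimize(num_iterations=100, verbose=True)
print("Best position found:", best_position)
```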

This is the .GIF we generate:

Now PSO does a very good job even with fairly complex objective functions like:
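As a concrete stand-in (my pick for illustration), take the classic Rastrigin function, which has many local minima and a single global minimum at the origin:

```python
import numpy as np

def rastrigin(x):
    """Rastrigin benchmark: highly multimodal, global minimum of 0 at x = 0."""
    return 10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))
```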

Running the same code as above, this is the .GIF we get:

So we can see that our little PSO doesn’t only get to the minimum in cases where there is one evident minimum: it also reaches what we believe is the global minimum (we can’t verify it analytically) even in cases that are much more complex and have multiple local minima.
4. Conclusions
Thank you very much for spending time with me in this optimization process. I hope I maximized your knowledge, subject to the constraint of this being a Towards Data Science blog post. 🙂
In this article, we did the following:
- We described the idea of working on a problem by analyzing the information of multiple particles at once
- We described the "optimization" task with the very simple use case of house price minimization.
- We talked about Particle Swarm Optimization (PSO) in general terms and talked about the fascinating idea behind it
- We went into the equation world, and we saw how the algorithm evolves an initially random state for num_particles particles and gets to the best estimate of the global minimum
- We used object-oriented programming to implement PSO from scratch.
- We tested it on a very simple quadratic problem and on a much more complex multi-minima problem. In both cases, we saw very promising results!
5. About me!
Thank you again for your time. It means a lot ❤
My name is Piero Paialunga and I’m this guy here:

Image made by author
I am a Ph.D. candidate at the University of Cincinnati Aerospace Engineering Department and a Machine Learning Engineer for Gen Nine. I talk about AI and Machine Learning in my blog posts and on LinkedIn. If you liked the article and want to know more about machine learning and follow my studies, you can:
A. Follow me on LinkedIn, where I publish all my stories
B. Subscribe to my newsletter. It will keep you updated about new stories and give you the chance to message me with any corrections or doubts you may have.
C. Become a referred member, so you won’t have any "maximum number of stories for the month" limit and can read whatever I (and thousands of other top Machine Learning and Data Science writers) write about the newest technology available.
D. Want to work with me? Check my rates and projects on Upwork!
If you want to ask me questions or start a collaboration, leave a message here or on Linkedin: