- How These Links Manage to Generate $60k+/month
The secret art of what I like to call… “Brute Scraping”.
It’s 2018 and “hypebeast” culture is at its peak. As a 17-year-old, I buy into it and immediately fall in love with what it has to offer. But what interests me the most is the billion-dollar sneaker resale market. My friend and I pool together $300 to invest in our first sneaker bot (a script that simulates customer checkouts entirely through HTTP requests), but as I continue to explore the niche sneaker reselling community, I begin to realize there’s even more unrealized profit in the scene. Within a couple of months, I have figured out botting, built connections with sneaker boutiques in my area, and am acquiring coveted sneaker releases weeks before they are publicly available.
Okay, but what if you don’t know an employee or manager at a sneaker boutique? How can you improve your chances of securing a limited-edition pair of sneakers online? This is where the power of web scraping and Python comes in. This isn’t just your normal requests.get to a product page followed by BeautifulSoup to parse the HTML. We wanted product links before they were even live on the frontend. Having the link or the product ID before anyone else gives you a significant advantage: your sneaker bot can use the product ID to skip the requests required to find the product, while everyone else’s bot is still polling the products endpoint on the API and waiting for a product title that matches their keywords. That head start means you get to purchase more pairs, netting you a greater profit than your competition.
So, where do you find these links? I have them. And I sold them, along with a multitude of other resources to help with sneaker botting and reselling, for $50/month. The companies I ran are what the sneaker community calls “cook groups”, where “cooking” means checking out as many sneakers as you can on a release day. Although my companies were limited to around two to three hundred members in any single month, other companies doing the same thing currently have over 2,000 active subscribers, charge up to $60/month, and rake in $60,000–120,000 a month.
A table calculating the net monthly revenue for the top cook groups in the sneaker community. Source: https://twitter.com/ethanzolla/status/1349146956023517185
Now that we understand the importance of these early links and why they are so valuable, let’s get into a basic understanding of how to acquire them. One way could be a connection with an employee at a boutique, but where’s the fun in that? If you’re a web scraper, you’re probably familiar with throwing a request into a while True loop and adding a proxy library to avoid getting rate limited, so you can extract information from a live page at any given time. But how do you get information from a page that is not live yet? The answer is patterns.
Example 1: Ralph Lauren
Sweater purchased using the early link, and its current value.
Let’s start off with something easy. When Ralph Lauren announced that they would be releasing this sweater on their site, I immediately began researching how their site worked. Scrolling through the latest products, I noticed that the product IDs (PIDs for short) were incrementing. I threw the latest PID into a Python script and came up with the script below:
19 lines of code. That is all it took for me to profit $1500. Because I noticed the PIDs were increasing, I decided to test the ones that came after the latest PID already live on the site. Although most requests returned a 404 status code, some returned a 200, and the URL you were redirected to included ‘hidden’ as well as the full product name. Opening that URL in your browser showed an empty product page, but once the release went live, one refresh put you on the product page without any searching. I ended up open-sourcing this script to introduce people to the basics of getting early links, since there was such a small chance I would need a Ralph Lauren scraper for my business again.
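As a minimal sketch of the idea described above (this is not the original 19-line script), the loop could look something like the following. The URL template PRODUCT_URL, the function names, and the injectable fetch argument are all assumptions made for illustration; the real URL pattern is not given in this article.

```python
from typing import Callable, List, Tuple

# Hypothetical placeholder: the real product URL pattern is not given here.
PRODUCT_URL = "https://example.com/product?pid={pid}"

# A fetcher takes a URL and returns (status code, final URL after redirects).
Fetch = Callable[[str], Tuple[int, str]]


def http_fetch(url: str) -> Tuple[int, str]:
    """Request a URL, follow redirects, and report where we ended up."""
    import requests  # third-party; imported lazily so the sketch loads without it

    resp = requests.get(url, allow_redirects=True, timeout=10)
    return resp.status_code, resp.url


def find_early_links(
    latest_pid: int, count: int, fetch: Fetch = http_fetch
) -> List[Tuple[int, str]]:
    """Probe the `count` PIDs after `latest_pid` and keep the hidden pages."""
    hits = []
    for pid in range(latest_pid + 1, latest_pid + 1 + count):
        status, final_url = fetch(PRODUCT_URL.format(pid=pid))
        # Most PIDs return 404; a 200 whose final URL contains "hidden"
        # is the signal described in the text.
        if status == 200 and "hidden" in final_url:
            hits.append((pid, final_url))
    return hits
```

Passing the fetcher in as an argument is just a design choice for this sketch: it lets you swap in a proxy-aware session or a fake for testing without touching the loop.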
This explains the basic premise of “brute scraping”. Like brute-forcing, you rapidly check hundreds of thousands of combinations in hopes of a response, but instead of trying to find someone's password, you’re trying to find a link to a sneaker or coveted clothing item that could be worth thousands on the secondary market.
Example 2: Kith
Kith is one of streetwear’s biggest companies. They release huge collaborations and hyped sneaker drops. The method I’m about to reveal has since been patched by Shopify (the eCommerce platform that hosts the Kith site). The majority of bots used by sneaker resellers support Shopify (meaning they will automatically check out products on Shopify stores for you), so having a reliable source for Kith early links was important.
Links posted for my subscribers two days before they were live on Friday 1/18/20.
For sneaker releases, we tested https://kith.com/products/sneaker-SKU: if the response URL changed into a link containing the name of the sneaker, that was the correct link for the release. But what about Kith clothing releases? How are we supposed to know the SKU of a Kith product? Well, like I said before, the answer is patterns. A typical Kith SKU looks like this: KH6193-301. I made a script that would recognize similarities between links, and what I noticed was that all of the SKUs for tee shirts on Kith’s website began with KH31, KH33, KH34, KH35, or KH36. For jackets, they began with KH10 or KH11. And for pants, they began with KH61. Now, how do you indicate the style of the item? The style was the number after the dash. Black was 100, white was 101, and navy was 102. That’s 7 of the 9 characters of a Kith SKU figured out, leaving only two unknown characters (100 combinations if both are digits, as in the example above). All that’s left is to write a script to brute force SKUs with the acquired information and watch the magic unfold. Say you’ve seen Kith’s recent post on Instagram and you know they are releasing a white tee shirt: you plug KH31 and 101 into the script, and you get back a URL with a name that matches the one in the Instagram post.
These were the SKU patterns for 2019. They may or may not have changed by now.
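To make the size of that search concrete, here is a small sketch that enumerates the candidate SKUs for a known prefix and style code. It assumes the two unknown characters are digits, as in the KH6193-301 example; the function name candidate_skus is hypothetical, and how each candidate is then checked against the site is left out.

```python
from typing import Iterator


def candidate_skus(prefix: str, style_code: str) -> Iterator[str]:
    """Yield every SKU of the form <prefix><two digits>-<style_code>.

    Assumes the two unknown characters are digits, which gives 100 candidates.
    """
    for n in range(100):
        yield f"{prefix}{n:02d}-{style_code}"


# Example: a white (101) tee shirt whose SKU starts with KH31.
skus = list(candidate_skus("KH31", "101"))
print(len(skus), skus[0], skus[-1])
```

With only 100 candidates per prefix and style, even checking all five tee shirt prefixes is a few hundred requests rather than a real brute force.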
Example 3: Other Shopify
The final example I’d like to end this article with is other Shopify sites. This was the final boss when it came to scraping early links. A Shopify variant is the 14-digit number representing the size, color, and name of a product. For example, when you select a size or color option on the product page of a site hosted with Shopify, you will notice ?variant= gets appended to the current link. With Shopify, you can add a product to your cart without going to the product page. All you need is the variant and the store’s cart API URL. For example, going to this link (may or may not work depending on Shopify changes) will return JSON data regarding the variant that has been added to your cart. You get everything from this link: the size and color the variant represents, the original product URL, and the product name. The best part is that this link worked with variants that had been added in the store’s backend dashboard but weren’t live on the frontend yet. So how do we use this to gather early information? It’s impossible to brute force 14 digits within a small period of time, and it would take an absurd number of proxies to make the requests required to find the variant you want.
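To put that claim in rough numbers, here is a back-of-the-envelope estimate. The request rate is an assumed figure chosen for illustration, not a measurement, and 10^14 is simply an upper bound on the number of 14-digit IDs.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_scan(candidates: int, requests_per_second: float) -> float:
    """Time needed to try every candidate once at a fixed request rate."""
    return candidates / requests_per_second / SECONDS_PER_YEAR


ID_SPACE = 10 ** 14    # upper bound on 14-digit variant IDs
ASSUMED_RATE = 1_000   # hypothetical sustained requests per second

years = years_to_scan(ID_SPACE, ASSUMED_RATE)
print(f"about {years:,.0f} years")
```

Even at that generous assumed rate, a naive scan lands in the thousands of years, which is why the search space has to be cut down some other way.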
But once again, there was a pattern. Every Shopify variant was divisible by 32768. Consecutive variants in a variant group (the size/color variations of a product) differed by exactly 32768. The gap between variant groups could be any size, but it was still a multiple of 32768. And new variants added to a store were larger than the ones before. Okay, great, this is all really good information, but we still have to brute force a lot of digits. We could keep adding 32768 to the latest variant on the store and keep making requests to the cart URL, but it would still take tens of thousands of requests before we landed on a real variant. The trick? Skip. If each sneaker has an estimated 15 sizes, then we can increment the variants we test by 491520 (32768 * 15) instead. If we get a response, we stop, backtrack in steps of 32768 to make sure we didn’t miss any, and then step forward again until we stop getting a response, which marks the end of the variant group. Now you have the product name, URL, and all the sizes and variants of a product that isn’t live yet.
Early product links and cart links were given to my members.
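The skip-and-backtrack search described above can be sketched roughly as follows. This is a sketch under stated assumptions, not the code I ran: exists is a hypothetical stand-in for "request the cart URL with this variant and see whether you get a response", and find_variant_group, sizes_per_group, and max_probes are names made up for the example.

```python
from typing import Callable, List

STEP = 32768  # spacing between consecutive variants, per the pattern above


def find_variant_group(
    latest_variant: int,
    exists: Callable[[int], bool],
    sizes_per_group: int = 15,
    max_probes: int = 10_000,
) -> List[int]:
    """Find the next variant group after `latest_variant`.

    `exists(variant)` is a hypothetical callback that returns True when the
    store responds for that variant. Returns [] if nothing is found within
    `max_probes` coarse steps.
    """
    stride = STEP * sizes_per_group  # 491520 for the assumed 15 sizes
    candidate = latest_variant + stride

    # Coarse pass: jump a whole estimated group at a time.
    for _ in range(max_probes):
        if exists(candidate):
            break
        candidate += stride
    else:
        return []

    # Backtrack in single steps to find the start of the group.
    first = candidate
    while first - STEP > latest_variant and exists(first - STEP):
        first -= STEP

    # Step forward until there is no response to find the end of the group.
    last = candidate
    while exists(last + STEP):
        last += STEP

    return list(range(first, last + STEP, STEP))
```

One caveat of this sketch: the coarse pass can jump straight over a group that has fewer variants than the assumed sizes_per_group, so a smaller estimate trades more requests for fewer misses.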
After 2 years in the sneaker scene, I decided to take time off to focus on school and on furthering my career as a Full Stack Developer. I love beautiful design and the ability to express yourself through creativity, not scalping sneakers and reselling them. Even though it was profitable, it’s never fun to keep doing something you don’t really enjoy.
I have no idea if the final method still works, and I have no idea if that’s what the companies that sell early links are doing. That was just my personal method, and it may even look really inefficient to the companies I was competing against. Before I shut down my company, I was messing with Shopify’s GraphQL API and found some interesting things, but never completely figured it out. Overall, it was a really fun learning experience. It was really cool to find and understand the weaknesses behind certain APIs, and it led me to build my own APIs that are able to protect themselves from these exploits and issues.
The legality of these things is still pretty unclear. Shopify CTO Jean-Michel Lemieux often shares banter with the developers of sneaker bots and people doing the same thing as me, and he even seems to encourage it. So proceed at your own risk.
Thank you for reading my first Medium article. I hope this was an entertaining and interesting read, and I hope there was something to take away, whether you’re running an eCommerce site, or you’re an aspiring sneaker reseller looking to learn how to do things like this.
- Date of publication:
- Tue, 02/23/2021 - 15:23