CAPTCHA is not important, but finding everyone is!
The Election Commission thought a CAPTCHA should be plain numbers
The Election Commission of India
Elections in India are conducted by a gigantic body of personnel and machines working together: the Election Commission.
The Wiki will tell you that this is an old autonomous body for administering elections in India. However, the Election Commission also maintains and publishes a large number of data points generated during the elections.
One such source for voters and power users is the Voter Information website, which very usefully provides the option to search for a voter's information.
Now in the normal case, when you want to see your own voter information, you can simply fill in the form and hit that sweet search button. But let's say you want to scan and maybe hoard information about select individuals or groups meeting your specific requirements. What will you do then?
One possible solution is to go to the specific CEO website, download the electoral roll PDFs, and search through the poorly formatted and poorly recorded files.
A better solution, however, could be simply spamming the main Electoral Search website!
So let's see. This is the website, with its rather simplistic CAPTCHA:
All the supposedly secure CAPTCHAs generated on the site consist of nothing but plain digits from 0 to 9. Now this problem can easily be solved using AI, right? Let's just spin up TensorFlow and train ourselves a new model!
Not so fast, buddy. As cool as that may be, we really wouldn't want to waste our time building an A.I. for such a "secure" CAPTCHA system, would we?
Instead, I give you the ever-useful pyautogui with its even more useful functions:
We'll just use the locateOnScreen(image, grayscale=False) function for our menial job. Does this mean we can save every possible digit as a separate image and use those images to identify the numbers on the screen?! Yes and no. Finding the digits is easy, but reconstructing a whole CAPTCHA sequence in the right order is not.
For example, the CAPTCHA might be: 1 4 4 6 2
Now you'd surely find 1, 4, 6 and 2, but what about the order, and what about repeated digits? The answer? Just look more closely at the function we are using!
locateOnScreen(image, grayscale=False) #returns the coordinates of the image found!
Taking those coordinates and arranging them left to right gives us the CAPTCHA digits in order.
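The ordering step on its own can be sketched without touching the screen at all. This is only an illustration: order_digits is a hypothetical helper, and the (left, digit) pairs below are invented numbers, not real output from pyautogui.

```python
def order_digits(detections):
    """Join digits in left-to-right order of their x-coordinates.

    detections is a list of (left, digit) pairs, one pair per match,
    so a digit that appears twice contributes two pairs.
    """
    return ''.join(str(digit) for left, digit in sorted(detections))


# Made-up detections for the CAPTCHA "1 4 4 6 2".
# The digit 4 was "found" twice, at two different x positions.
detections = [(130, 4), (100, 1), (220, 2), (160, 4), (190, 6)]
print(order_digits(detections))  # 14462
```

Keeping one pair per match, rather than one per digit, is what lets a repeated digit show up twice in the result.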
Here's the script:
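import pyautogui as do

def getLeftValues(image):
    # Left x-coordinate of every on-screen match for this digit's template.
    # locateAllOnScreen is the sibling of locateOnScreen that yields each
    # occurrence, so a repeated digit (the two 4s in 14462) is found twice.
    try:
        boxes = list(do.locateAllOnScreen(
            './CAPTCHA/voterinfo/'
            + str(image) + '.png'))
    except Exception:
        # Depending on the PyAutoGUI version, "not found" raises
        # an exception instead of returning nothing.
        boxes = []
    return [box[0] for box in boxes]

def getCaptcha():
    # One template image per digit; this assumes 0.png exists as well.
    images = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    found = []
    for image in images:
        for leftValue in getLeftValues(image):
            found.append((leftValue, image))
    # Sorting by x-coordinate reads the digits left to right.
    return ''.join(str(image) for leftValue, image in sorted(found))

captcha = getCaptcha()
print("The CAPTCHA IS: ", captcha)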
And voila!
Now you can reap the benefits of programmatic spamming!
NOTE: In all seriousness, CAPTCHA systems of this type should be removed and replaced by the authorities as soon as possible.
As an example of how bad things are for the Commission, here is the state of each region's search site at the time of writing this post:
No Searchable Facility
- Assam (nvsp.in)
- Chandigarh (digitalindia.gov.in)
- Meghalaya (nvsp.in)
No Captcha
- Karnataka
- Uttar Pradesh
- Uttarakhand
- Andhra Pradesh
- Arunachal Pradesh
- Dadra and Nagar Haveli
- Goa
- Haryana
- Jharkhand
- Kerala
- Lakshadweep
- Madhya Pradesh
- Maharashtra
- Mizoram
- Nagaland
- Odisha
- Rajasthan
- Sikkim
- Telangana
- Tripura
- West Bengal
Plain Digit Captcha
- Himachal Pradesh
- Delhi
Has Captcha (Can be broken)
- Bihar
- J & K
- Chhattisgarh (only the Hindi version is available)
- Gujarat
- Manipur (dotted foreground)
Unreachable
- Andaman & Nicobar Islands
- Daman & Diu
- Puducherry
- Tamil Nadu
The Entire Captcha Is Itself Text
- Punjab (way to go, Punjab)