ReCAPTCHA is yet another free-of-charge product offered benevolently by Google for any webmaster to implement within their own services. How does ReCAPTCHA differentiate legitimate human users from bots? ReCAPTCHA relies extensively on user fingerprinting, putting emphasis on the question of “Which human is this user?” rather than the ordinary “Is this user human?” It’s worth noting how much easier it is to solve ReCAPTCHAs when the user is logged into their Google account, which allows Google to associate their actions with their real identity. A similar effect is often reported by users of non-Google browsers, who notice that ReCAPTCHAs take more time to complete in Firefox than in Chrome. This is in line with many other anti-competitive techniques that Google has used over the years to help grow its market share.
Determining exactly how ReCAPTCHA works is very difficult: Google not only heavily obfuscates its JavaScript, but also implements an entire VM in JavaScript with its own bytecode language. Even so, there have been many attempts to reverse-engineer some of the client-side code, as well as to theorize about how the server-side logic operates. Initial attempts at reverse-engineering ReCAPTCHA show copious amounts of information being collected, including but not limited to: plugins, user agent, IP address, screen resolution, execution times, timezone, language, click/keyboard/touch information within the frame of the captcha, test results of many browser-specific functions and CSS evaluation, information about canvas element rendering, and cookies, including those affiliated with your Google account that were placed within the last 6 months.
https://nearcyan.com/you-probably-dont-need-recaptcha/ (mirror)
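To make the idea of fingerprinting concrete, here is a minimal Python sketch of how a handful of the signals listed above could be collapsed into one stable identifier. It is purely illustrative and does not reflect Google's actual code, which remains obfuscated: the `fingerprint` function, the field names, and the sample values are all hypothetical, and real systems presumably weigh many more signals server-side rather than hashing them exactly.

```python
import hashlib
import json


def fingerprint(signals: dict) -> str:
    """Return a stable hex digest for a set of client signals.

    Hypothetical helper: serializing with sorted keys means the same
    signals always produce the same digest, whatever their order.
    """
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up sample values for a few of the signals named in the text.
visitor_a = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
    "screen_resolution": "1920x1080",
    "timezone": "America/New_York",
    "language": "en-US",
    "plugins": ["PDF Viewer"],
}

# The same browser with a single attribute changed.
visitor_b = dict(visitor_a, timezone="Europe/Berlin")

print(fingerprint(visitor_a))
print(fingerprint(visitor_b))
```

The two digests differ even though only the timezone changed, which is why a combination of individually unremarkable attributes can single out one browser among many. An exact hash like this is also brittle, since any small change yields an unrelated identifier, so it should be read as a toy model of the concept and nothing more.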