Job Title:
Web Crawling Analyst (2 to 10 yrs)
Company: AIMLEAP
Location: Vapi, Gujarat
Created: 2026-01-07
Job Type: Full Time
Job Description:
Web Crawling Analyst

Experience: 3–10 Years
Location: Remote (Work from Home)
Mode of Engagement: Full-time
No. of Positions: 8
Educational Qualification: Bachelor’s degree in Computer Science, IT, Data Engineering, or a related field
Industry: IT / Software Services / Data & AI
Notice Period: Immediate to 30 Days (Preferred)

What We Are Looking For:
- 3–10 years of hands-on experience in web crawling and browser-based scraping, especially on JavaScript-heavy and protected websites.
- Strong expertise with Playwright, Selenium, or Puppeteer for dynamic rendering and complex user flows (see the Playwright sketch below).
- Practical experience handling cookies, sessions, headers, local storage, and authentication workflows.
- Proven ability to manage CAPTCHA challenges using third-party services or AI-based solvers.
- Solid understanding of proxy rotation, IP management, and user-agent and fingerprinting techniques to avoid detection and rate limits (see the proxy-rotation sketch below).
- Ability to design scalable, resilient crawling pipelines with retries, logging, and monitoring.

Responsibilities:
- Design, develop, and maintain high-scale web crawling workflows for dynamic and protected websites.
- Implement advanced browser automation solutions using Playwright, Selenium, or Puppeteer.
- Integrate CAPTCHA-solving services, proxy rotation mechanisms, and anti-detection strategies.
- Build ETL-style data pipelines for extraction, validation, transformation, and storage (see the ETL sketch below).
- Ensure data quality through error handling, retries, monitoring, and alerting.
- Store structured data efficiently in SQL or NoSQL databases.
- Collaborate with AI, data engineering, and product teams to deliver reliable crawling datasets.
- Continuously improve crawling success rates, performance, and scalability.

Qualifications:
- Minimum 3 years of hands-on experience in Python-based web crawling and automation.
- Strong working experience with Playwright, Selenium, Puppeteer, and browser automation.
- Proficiency in Python, including libraries such as Requests, BeautifulSoup, Scrapy, and async frameworks (see the Scrapy sketch below).
- Hands-on experience with proxies, fingerprinting, session handling, and anti-bot mechanisms.
- Good understanding of SQL / NoSQL databases for structured data storage.
- Exposure to cloud platforms (AWS / GCP / Azure) is a plus.
- Strong debugging, analytical, and problem-solving skills.
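The browser-automation requirement above centers on rendering JavaScript-heavy pages before extraction. A minimal Playwright sketch in Python, assuming a placeholder URL and headless Chromium (the user agent and viewport values are illustrative, not prescribed by this posting):

from playwright.sync_api import sync_playwright

def fetch_rendered_page(url: str) -> str:
    """Render a JavaScript-heavy page and return its final HTML."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        # A realistic user agent and viewport reduce trivial bot fingerprints.
        context = browser.new_context(
            user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
            viewport={"width": 1366, "height": 768},
        )
        page = context.new_page()
        # Wait until network activity settles so JS-driven content is present.
        page.goto(url, wait_until="networkidle")
        html = page.content()
        # Cookies can be read from the context here and persisted to reuse
        # an authenticated session across runs.
        browser.close()
        return html

if __name__ == "__main__":
    print(fetch_rendered_page("https://example.com")[:200])  # placeholder URL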
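Proxy rotation, retries, and logging sit naturally at the HTTP layer. A minimal sketch using Requests, assuming placeholder proxy endpoints (proxy-a/proxy-b are illustrative, not real infrastructure) and simple exponential backoff:

import itertools
import logging
import time

import requests

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("crawler")

# Round-robin over a (placeholder) proxy pool.
PROXIES = itertools.cycle([
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
])

def fetch_with_retries(url: str, max_attempts: int = 3) -> str | None:
    """Fetch a URL, rotating proxies and backing off between failures."""
    for attempt in range(1, max_attempts + 1):
        proxy = next(PROXIES)
        try:
            resp = requests.get(
                url,
                proxies={"http": proxy, "https": proxy},
                headers={"User-Agent": "Mozilla/5.0"},  # rotate in practice
                timeout=10,
            )
            resp.raise_for_status()
            return resp.text
        except requests.RequestException as exc:
            log.warning("attempt %d via %s failed: %s", attempt, proxy, exc)
            time.sleep(2 ** attempt)  # exponential backoff between retries
    return None  # caller can alert/monitor on persistent failures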
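The ETL-style responsibility (extract, validate, transform, store) can be sketched with BeautifulSoup and SQLite; the div.product selector and the products table schema are hypothetical, chosen only for illustration:

import sqlite3

from bs4 import BeautifulSoup

def extract(html: str) -> list[dict]:
    """Parse product records out of fetched HTML, skipping incomplete ones."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("div.product"):  # hypothetical selector
        name = item.select_one("h2")
        price = item.select_one("span.price")
        if name and price:  # basic validation: drop incomplete records
            rows.append({
                "name": name.get_text(strip=True),
                "price": price.get_text(strip=True),
            })
    return rows

def load(rows: list[dict], db_path: str = "crawl.db") -> None:
    """Store validated records in SQLite (a stand-in for any SQL/NoSQL store)."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price TEXT)")
    con.executemany("INSERT INTO products VALUES (:name, :price)", rows)
    con.commit()
    con.close()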
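For the Scrapy experience listed under Qualifications, a minimal spider sketch; the start URL, CSS selectors, and pagination link are placeholders:

import scrapy

class ProductSpider(scrapy.Spider):
    """Hypothetical spider crawling placeholder product listings."""
    name = "products"
    start_urls = ["https://example.com/products"]  # placeholder URL

    def parse(self, response):
        # Yield one record per listing; missing fields come back as None
        # and can be filtered or validated in an item pipeline.
        for item in response.css("div.product"):
            yield {
                "name": item.css("h2::text").get(),
                "price": item.css("span.price::text").get(),
            }
        # Follow pagination if a next link exists.
        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)

A spider like this runs standalone with scrapy runspider spider.py -o items.json, which writes the yielded records to a JSON feed.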