
Java Developer with Web Crawler Experience

Negotiable Salary

Axiom Software Solutions Limited

Austin, TX, USA

Description

Role: Java Developer with Web Crawler Experience
Location: Austin, TX (Hybrid)

Responsibilities:
1. Web Crawler Development: Design and implement efficient, scalable web crawlers in Java to collect data from various online sources.
2. Data Extraction: Develop and maintain systems for structured data extraction, handling various data formats (HTML, JSON, XML, etc.).
3. Data Storage and Processing: Design data storage and processing pipelines, ensuring extracted data is clean, structured, and easily accessible.
4. Performance Optimization: Optimize web crawling processes for speed, efficiency, and accuracy while keeping the impact on source websites minimal.
5. Error Handling and Logging: Implement error-handling mechanisms and logging systems to detect and resolve issues during crawling operations.
6. Data Integrity and Compliance: Ensure data collection practices are ethical, legal, and compliant with relevant rules and regulations (e.g., robots.txt directives, copyright laws).

Requirements:
- Proficiency in Java and experience with Java-based web scraping libraries (e.g., Jsoup, Apache HttpClient).
- Knowledge of web crawling frameworks and tools, such as Scrapy, Selenium, or Puppeteer.
- Strong understanding of HTML, CSS, JavaScript, and web data structures.
- Familiarity with data parsing and handling techniques for JSON, XML, and other common formats.
- Experience with database technologies (SQL, NoSQL) to store and manage scraped data.
- Knowledge of the HTTP protocol, headers, proxies, and load handling.

Source: Workable

© 2025 Servanan International Pte. Ltd.