
Java Developer with Web Crawler Experience

Negotiable Salary

Axiom Software Solutions Limited

Austin, TX, USA


Description

Role: Java Developer with Web Crawler Experience
Location: Austin, TX (Hybrid)

Responsibilities:
1. Web Crawler Development: Design and implement efficient and scalable web crawlers in Java to collect data from various online sources.
2. Data Extraction: Develop and maintain systems for structured data extraction, handling various data formats (HTML, JSON, XML, etc.).
3. Data Storage and Processing: Design data storage and processing pipelines, ensuring extracted data is clean, structured, and easily accessible.
4. Performance Optimization: Optimize web crawling processes for speed, efficiency, and accuracy, while ensuring minimal impact on source websites.
5. Error Handling and Logging: Implement error-handling mechanisms and logging systems to detect and resolve issues during crawling operations.
6. Data Integrity and Compliance: Ensure data collection practices are ethical, legal, and compliant with relevant regulations (e.g., robots.txt, copyright laws).

Requirements:
Proficiency in Java and experience with Java-based web scraping libraries (e.g., Jsoup, Apache HttpClient).
Knowledge of web crawling frameworks and tools, such as Scrapy, Selenium, or Puppeteer.
Strong understanding of HTML, CSS, JavaScript, and web data structures.
Familiarity with data parsing and handling techniques for JSON, XML, and other common formats.
Experience with database technologies (SQL, NoSQL) to store and manage scraped data.
Knowledge of HTTP protocols, headers, proxies, and load handling.
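For illustration only (not part of the original posting), the sketch below shows the kind of fetch-and-parse step the responsibilities describe, using the Jsoup library named in the requirements; the URL, user-agent string, and CSS selector are hypothetical placeholders.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class CrawlerSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical target URL; a production crawler would first honor robots.txt.
        String url = "https://example.com/articles";
        Document doc = Jsoup.connect(url)
                .userAgent("example-crawler/0.1") // identify the crawler to the host
                .timeout(10_000)                  // milliseconds; fail fast on slow hosts
                .get();
        // Extract structured data (absolute link URLs and anchor text) from the parsed HTML.
        for (Element link : doc.select("a[href]")) {
            System.out.println(link.attr("abs:href") + "\t" + link.text());
        }
    }
}

In a full crawler this step would sit behind rate limiting, retry and error handling, logging, and a persistence layer (SQL or NoSQL), per the responsibilities above.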

Source: Workable

Location
Austin, TX, USA


You may also like

Workable
Sr. Data Scientist
We are seeking a talented and experienced Sr. Data Scientist to join our team. As a Data Scientist, you will play a crucial role in designing, developing, and implementing data science solutions to empower investigators and enhance our capabilities in extracting meaningful information and identifying anomalous patterns of activity.

Responsibilities:
Collaborate with subject matter experts, team leads, and third-party vendors to define new features and functions for automation, aiding investigators in extracting meaningful information about a target and their surrounding network.
Design, code, test, and document data science microservices, primarily in Python.
Support the integration of disparate bulk data sources into a unified database.
Develop and optimize graph traversal queries and analytic pipelines to support analyst use cases, ensuring a smooth transition from development to test and production environments.
Extract valuable information from unstructured text, including SAR narratives and web-scraped data related to cryptocurrency addresses and actors.
Generate synthetic data for testing and development environments, as well as for the MM capstone training, adapting to evolving MM training and data holdings.

Requirements:
Proficiency in Python programming.
Experience with graph traversal languages such as Gremlin, Cypher, or GraphML, along with expertise in network analytics, including centrality, community detection, link prediction, pattern recognition, and blockchain analytics (preferred).
Strong SQL or other relational database query experience.
In-depth knowledge of graph-structured data and analytics.
Familiarity with Natural Language Processing (NLP) techniques.
Background in financial and banking data analytics, including deriving insights from data and extracting information from unstructured data.
Expertise in blockchain architecture and cryptocurrency data analytics.
Knowledge of GPS technology and its integration into analytical processes.
Experience with cloud platforms, specifically Amazon Web Services (AWS).
Understanding of machine learning algorithms and their application in cybersecurity analytics.

Qualifications:
Bachelor's or advanced degree in Computer Science, Data Science, or a related field.
5+ years of experience in a similar role.
Strong communication and collaboration skills.
Ability to work in a dynamic and fast-paced environment.
An active Top Secret clearance is required.
Tampa, FL, USA
Negotiable Salary
Workable
Senior Big Data Engineer
ABOUT US:
Headquartered in the United States, TP-Link Systems Inc. is a global provider of reliable networking devices and smart home products, consistently ranked as the world's top provider of Wi-Fi devices. The company is committed to delivering innovative products that enhance people's lives through faster, more reliable connectivity. With a commitment to excellence, TP-Link serves customers in over 170 countries and continues to grow its global footprint.

We believe technology changes the world for the better! At TP-Link Systems Inc., we are committed to crafting dependable, high-performance products to connect users worldwide with the wonders of technology. Embracing professionalism, innovation, excellence, and simplicity, we aim to assist our clients in achieving remarkable global performance and enable consumers to enjoy a seamless, effortless lifestyle.

KEY RESPONSIBILITIES:
Develop and maintain the Big Data Platform by performing data cleansing, data warehouse modeling, and report development on large datasets.
Collaborate with cross-functional teams to provide actionable insights for decision-making.
Manage the operation and administration of the Big Data Platform, including system deployment, task scheduling, proactive monitoring, and alerting to ensure stability and security.
Handle data collection and integration tasks, including ETL development, data de-identification, and managing data security.
Provide support for other departments by processing data, writing queries, developing solutions, performing statistical analysis, and generating reports.
Troubleshoot and resolve critical issues, conduct fault diagnosis, and optimize system performance.

REQUIRED QUALIFICATIONS:
Bachelor's degree or higher in Computer Science or a related field, with at least three years of experience maintaining a Big Data platform.
Strong understanding of Big Data technologies such as Hadoop, Flink, Spark, Hive, HBase, and Airflow, with proven expertise in Big Data development and performance optimization.
Familiarity with Big Data OLAP tools like Kylin, Impala, and ClickHouse, as well as experience in data warehouse design, data modeling, and report generation.
Proficiency in Linux development environments and Python programming.
Excellent communication, collaboration, and teamwork skills, with a proactive attitude and a strong sense of responsibility.

PREFERRED QUALIFICATIONS:
Experience with cloud-based deployments, particularly AWS EMR; familiarity with other cloud platforms is a plus.
Proficiency in additional languages such as Java or Scala is a plus.

Benefits:
Salary Range: $150,000 - $180,000
Free snacks and drinks, and provided lunch on Fridays
Fully paid medical, dental, and vision insurance (partial coverage for dependents)
Contributions to 401k funds
Bi-annual reviews and annual pay increases
Health and wellness benefits, including free gym membership
Quarterly team-building events

At TP-Link Systems Inc., we are continually searching for ambitious individuals who are passionate about their work. We believe that diversity fuels innovation, collaboration, and drives our entrepreneurial spirit. As a global company, we highly value diverse perspectives and are committed to cultivating an environment where all voices are heard, respected, and valued.
We are dedicated to providing equal employment opportunities to all employees and applicants, and we prohibit discrimination and harassment of any kind based on race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. Beyond compliance, we strive to create a supportive and growth-oriented workplace for everyone. If you share our passion and connection to this mission, we welcome you to apply and join us in building a vibrant and inclusive team at TP-Link Systems Inc. Please, no third-party agency inquiries, and we are unable to offer visa sponsorships at this time.
Irvine, CA, USA
$150,000-180,000/year
Workable
Data & BI Senior Data Engineer
Job Description:
We are seeking a highly skilled and experienced Senior Data Engineer to join our team. The ideal candidate will have a strong background in data engineering, with a specialization in Matillion, SSIS, Azure DevOps, and ETL processes. This role will involve designing, developing, testing, and deploying ETL jobs, collaborating with cross-functional teams, and ensuring efficient data processing.

Key Responsibilities:
Design, develop, test, and deploy Matillion ETL jobs in accordance with project requirements.
Collaborate with the Data and BI team to understand data integration needs and translate them into Matillion ETL solutions.
Create and modify Python code/components in Matillion jobs.
Identify opportunities for performance optimization and implement enhancements to ensure efficient data processing.
Collaborate with cross-functional teams, including database administrators, data engineers, and business analysts, to ensure seamless integration of ETL processes.
Create and maintain comprehensive documentation for Matillion ETL jobs, ensuring knowledge transfer within the team.
Create, test, and deploy SQL Server Integration Services (SSIS) packages and schedule them via the ActiveBatch scheduling tool.
Create Matillion deployment builds using the Azure DevOps CI/CD pipeline and perform release manager activities.
Review code of other developers (L2, L3-BI/DI) to ensure code standards and provide approval as part of code review activities.
Resolve escalation tickets from the L2 team as part of the on-call schedule.
Working knowledge of APIs and the Postman tool is an added advantage.

Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
5+ years of experience in data engineering, with a focus on ETL processes.
Proficiency in Matillion, SSIS, Azure DevOps, and ETL.
Strong knowledge of SQL, Python, and data integration techniques.
Experience with performance optimization and data processing enhancements.
Excellent collaboration and communication skills.
Ability to work in a fast-paced, dynamic environment.

Preferred Skills:
Experience with cloud platforms such as AWS or Azure.
Knowledge of data warehousing and data modeling.
Familiarity with DevOps practices and CI/CD pipelines.
Strong problem-solving skills and attention to detail.
Atlanta, GA, USA
Negotiable Salary