
Java Developer with Web Crawler Experience

Negotiable Salary

Axiom Software Solutions Limited

Austin, TX, USA


Description

Role: Java Developer with Web Crawler Experience
Location: Austin, TX (Hybrid)

Responsibilities:
1. Web Crawler Development: Design and implement efficient and scalable web crawlers in Java to collect data from various online sources.
2. Data Extraction: Develop and maintain systems for structured data extraction, handling various data formats (HTML, JSON, XML, etc.).
3. Data Storage and Processing: Design data storage and processing pipelines, ensuring extracted data is clean, structured, and easily accessible.
4. Performance Optimization: Optimize web crawling processes for speed, efficiency, and accuracy, while ensuring minimal impact on source websites.
5. Error Handling and Logging: Implement error-handling mechanisms and logging systems to detect and resolve issues during crawling operations.
6. Data Integrity and Compliance: Ensure data collection practices are ethical, legal, and compliant with relevant rules and regulations (e.g., robots.txt, copyright laws).

Requirements:
Proficiency in Java and experience with Java-based web scraping libraries (e.g., Jsoup, Apache HttpClient).
Knowledge of web crawling frameworks and tools, such as Scrapy, Selenium, or Puppeteer.
Strong understanding of HTML, CSS, JavaScript, and web data structures.
Familiarity with data parsing and handling techniques for JSON, XML, and other common formats.
Experience with database technologies (SQL, NoSQL) to store and manage scraped data.
Knowledge of HTTP protocols, headers, proxies, and load handling.

Source: Workable


© 2025 Servanan International Pte. Ltd.