Skills:
Unstructured data processing, Data Modelling, Data Optimization Algorithms, Python, R, Java, Ruby, Clojure, MATLAB, Pig, Weka, NumPy, k-NN, Naive Bayes, SVM, Decision Forests, D3.js, ggplot2, Hadoop Ecosystem
Programming Experience:
8+ years of Software Development experience
2+ years of experience as a Data Scientist
Experience with common data science toolkits, such as Python, R, Ruby, Weka, NumPy, and MATLAB. Excellence in at least one of these is highly desirable.
Experience with at least one NoSQL database, such as MongoDB, Cassandra, or HBase
Proficiency in query languages such as SQL, Hive, and Pig
Good applied statistics skills, such as distributions, statistical testing, and regression (e.g., the softmax function)
Good scripting and programming skills, with the ability to explain solutions in the form of flowcharts and pseudocode
Responsibilities:
Execute and code analytic projects in response to business needs.
In conjunction with data owners and department managers, contribute to the development of data models and protocols for mining production databases.
Understand data mining architectures, modelling standards, reporting, and data analysis methodologies.
Work with application developers to extract data relevant for analysis.
Collaborate with unit managers, end users, development staff, and other stakeholders to integrate data mining results with existing systems.
Provide and apply quality assurance best practices for data mining/analysis services.
Determine required network components to ensure data access, as well as data consistency and integrity.
Respond to and resolve data mining performance issues. Monitor data mining system performance and implement efficiency improvements.
Work autonomously.
Preferred / Good to have Experience:
Experience with at least one dimensional modelling tool
Experience in building a modern, heterogeneous Data Lake
Expertise in Hadoop ecosystem programming
Good understanding of networking concepts
Knowledge of cyber threats, vulnerabilities, and their consequences