
Data Integration and Profiling

€8-30 EUR

Closed
Posted 4 months ago

Paid on delivery
I'm looking for a skilled data analyst to undertake a data profiling and data curation project.

For the Data Curation Final Project, you should choose a domain of interest (e.g., movies, travel booking, sports, tournaments) for which you can identify a collection of (possibly open) data sources that need to be integrated. The collection should contain data from at least two different sources.

Data Profiling Part

1. Export the datasets of interest and perform Elementary Data Analysis (EDA), using either existing tools and libraries or your own implementations of EDA algorithms. Report the results of the EDA and comment on them in the final project documentation.
2. Design one or more relational databases to structure and store the exported datasets, and show how the dependency discovery algorithms introduced in the course (for unique column combinations (UCC), functional dependencies (FD), and inclusion dependencies (IND)) were used to extract metadata that supported the database design process.
3. The results of profiling the datasets for dependency discovery, and the resulting database schema design choices, must be properly commented on in the final project documentation.

Data Preparation and Integration Part

1. Consider the domain (and the datasets within that domain) that you chose for the Data Profiling part, and model the domain by means of an ontology/mediated schema. The ontology should represent all information that is relevant for the domain of interest, and it should be rich enough (at least 10 classes, plus the corresponding object and data properties).
2. Represent the ontology/mediated schema using the graphical notation introduced in the course (or a similar one).
3. Represent the ontology in Protégé.
4. Consider the relational database (or databases) that you designed for the Data Profiling part, including the relevant constraints that you specified and/or extracted from the data. Note that choosing multiple databases instead of a single one has an impact on the technological solution that you will need to deploy. In particular, integrating multiple databases requires a federation system (such as Teiid, Denodo, or Dremio) as an intermediate layer between the sources and Ontop. This is not needed if you work with a single database.
5. Design Virtual Knowledge Graph (VKG) mappings to connect the ontology to the database (or to the federated relational schema exposed by the federation system), using the Ontop plugin for Protégé (or Ontopic Studio).
6. Develop an application (e.g., in Java) for your domain that uses Ontop as a SPARQL endpoint to query the database through the ontology, extracting information that is of interest for your chosen domain. As an example of the kinds of queries that could be posed via your application, consider the queries underlying travel booking sites, where some parameters of a request are filled in via a form (e.g., the departure city, departure and arrival date and time) and answers are retrieved using those parameters. (You have to take into account the SPARQL fragment that is supported by the current version of Ontop.)
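To make the dependency discovery step of the Data Profiling part more concrete, here is a minimal sketch in Python of naive checks for the three dependency types the brief names (UCC, FD, IND). It is an illustration under stated assumptions, not the course's algorithms: the function names, the toy `movies` and `directors` tables, and the brute-force search are all hypothetical, and a real project would more likely rely on a dedicated profiling tool.

```python
from itertools import combinations


def is_unique(rows, cols):
    """Return True if the column combination `cols` has no duplicate value tuples (a UCC)."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def holds_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds on `rows`."""
    mapping = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        # setdefault stores the first rhs value seen for this lhs key
        if mapping.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def holds_ind(dep_rows, dep_col, ref_rows, ref_col):
    """Return True if every value of dep_col also appears in ref_col (a unary IND)."""
    return {r[dep_col] for r in dep_rows} <= {r[ref_col] for r in ref_rows}


def minimal_uccs(rows, columns, max_size=2):
    """Brute-force search for minimal UCCs with at most `max_size` columns."""
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(columns, size):
            # skip supersets of an already discovered UCC: they are not minimal
            if any(set(u) <= set(combo) for u in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found


# Hypothetical toy data standing in for two exported sources.
movies = [
    {"movie_id": 1, "title": "Alien", "year": 1979, "director_id": 10},
    {"movie_id": 2, "title": "Aliens", "year": 1986, "director_id": 11},
    {"movie_id": 3, "title": "Blade Runner", "year": 1982, "director_id": 10},
]
directors = [
    {"director_id": 10, "name": "Ridley Scott"},
    {"director_id": 11, "name": "James Cameron"},
]

if __name__ == "__main__":
    print(minimal_uccs(movies, ["movie_id", "title", "year", "director_id"]))
    print(holds_fd(movies, ["movie_id"], "title"))
    print(holds_ind(movies, "director_id", directors, "director_id"))
```

In a schema design, a discovered UCC is a candidate key, an FD with a non-key left-hand side hints at a table that could be normalized, and an IND such as `movies.director_id` within `directors.director_id` is a candidate foreign key. Dependencies found on sample data are only hypotheses and still need to be checked against domain knowledge.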
Project ID: 37630494

About the project

5 proposals
Remote project
Active 3 months ago

5 freelancers are bidding an average of €20 EUR for this job
Dear Arash E., I hope you are doing well. I am excited to offer my expertise in statistical analysis, with services covering both basic and advanced statistical techniques. With proficiency in SPSS, Stata, EViews, and Python, I am well equipped to handle diverse data analysis projects. My expertise includes, but is not limited to: Descriptive Statistics: Summarizing data using mean, median, and mode, along with standard deviation and variance. Hypothesis Testing: Employing t-tests, ANOVA, and chi-square tests to draw inferences. Correlation Analysis: Evaluating relationships between variables using Pearson, Spearman, or Kendall correlations. Advanced Statistical Analysis: Regression Analysis: Building predictive models with linear and logistic regression. Time Series Analysis: Identifying trends and patterns in time-based data. Factor Analysis and Principal Component Analysis: Reducing dimensionality for data simplification. Multivariate Analysis: Exploring complex relationships among multiple variables. Sincerely, Sajid Hussain
€8 EUR in 1 day
4.5 (3 reviews)
3.1
Hello. I have read your requirements and I can do this. Please message me on chat so we can discuss it further. I will be waiting for your reply.
€30 EUR in 1 day
4.9 (2 reviews)
2.5
Hi there Arash E., Good morning! I am a professional data scientist with skills including Ontology, Data Collection, Data Analysis, and Data Integration. I hold a master's degree in data analysis, which gives me the necessary background to handle your project. Having done similar projects, I can deliver high-quality work at a price we are both comfortable with and within the agreed timeline. Please send a message to discuss this project further. With regards, Lincoln
€25 EUR in 2 days
2.0 (1 review)
2.4
I am an experienced data analyst ready to undertake your data profiling and curation project. The focus is on a domain of interest (e.g., movies, travel booking, sports) where data from at least two sources needs integration.

Data Profiling: Export datasets and perform Elementary Data Analysis (EDA) using existing tools or custom algorithms. Design relational databases using dependency discovery algorithms (UCC, FD, IND) to structure and store datasets. Document and comment on profiling results and schema design choices.

Data Preparation and Integration: Model the chosen domain using an ontology/mediated schema. Represent the ontology graphically and in Protégé. Consider the designed relational database(s) and constraints. Choose between a single or multiple databases, considering the need for a federation system. Design VKG mappings connecting the ontology to the database using the Ontop plugin for Protégé. Develop a Java application utilizing Ontop as a SPARQL endpoint for querying the database through the ontology.

Why Choose Me: Proven expertise in data profiling and integration. Extensive experience with Ontop, Protégé, and Java. Strong analytical and documentation skills. Efficient in designing complex database schemas.

Budget: Let's discuss a reasonable budget based on the intricacies of the project.

Looking forward to bringing your vision to fruition through efficient data profiling and curation. Excited to collaborate on this comprehensive project!
€20 EUR in 7 days
0.0 (0 reviews)
0.0
As a data analyst, I possess a diverse set of skills and characteristics that make me a strong candidate for this project. The key ones include: 1. Data Proficiency: Ability to manipulate and analyze data to answer organizational questions. 2. Attention to Detail: Noticing and investigating suspicious values or events during data analysis. 3. Commercial Awareness: Understanding of business operations and customer needs. 4. Communication Skills: Excellent verbal and written communication for presenting findings and insights. 5. Analytical Skills: Proficiency in problem-solving and working with numbers. 6. Curiosity: A genuine desire to seek answers and develop understanding. 7. Meticulousness: Being thorough and detailed in data analysis. 8. Technical Skills: Proficiency in SQL and other database query languages, and in data visualization tools.
€19 EUR in 7 days
0.0 (0 reviews)
0.0

About the client

Italy
0.0
0
Member since Jan 9, 2024

Client verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)