
March 20 2015

12:05

Presenting public finance just got easier


CKAN 2.3 is out! The world-famous data handling software suite, which powers data.gov, data.gov.uk and numerous other open data portals across the world, has been significantly upgraded. How can this version open up new opportunities for existing and future deployments? Read on.

One of the new features of this release is the ability to create extensions that get called before and after a new file is uploaded, updated, or deleted on a CKAN instance.

This may not sound like a major improvement, but it creates a lot of new opportunities. It is now possible to analyse files (which are called resources in CKAN) and put them to new uses based on that analysis. To showcase how this works, Open Knowledge, in collaboration with the Mexican government, the World Bank (via the Partnership for Open Data), and the OpenSpending project, has created a new CKAN extension which uses this new feature.

It’s actually two extensions. One, called ckanext-budgets, listens for creation and updates of resources (i.e. files) in CKAN; when that happens, the extension analyses the resource to see if it conforms to the data file part of the Budget Data Package specification. The Budget Data Package specification is a relatively new specification for budget publications, designed for comparability, flexibility, and simplicity. It is similar to Data Packages in that it provides metadata around simple tabular files, such as CSV files. If the CSV file (a resource in CKAN) conforms to the specification (i.e. the columns have the correct titles), the extension automatically creates the Budget Data Package metadata from the CKAN resource data and makes the complete Budget Data Package available.
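
To make the mechanism concrete, here is a minimal sketch of a plugin built on CKAN 2.3’s new IResourceController hooks, loosely in the spirit of ckanext-budgets. This is not the extension’s actual code: the column set and the two helpers marked “hypothetical” are illustrative assumptions.

```python
# A minimal sketch (not the actual ckanext-budgets code) of a CKAN 2.3
# plugin using the new resource hooks.
import csv

import ckan.plugins as plugins

EXPECTED_COLUMNS = {"amount", "id", "code", "description"}  # example subset only


class BudgetCheckPlugin(plugins.SingletonPlugin):
    plugins.implements(plugins.IResourceController, inherit=True)

    def after_create(self, context, resource):
        # Called once a new resource (file) has been added to a dataset.
        self._check_resource(resource)

    def after_update(self, context, resource):
        # Called whenever an existing resource is changed.
        self._check_resource(resource)

    def _check_resource(self, resource):
        if resource.get("format", "").lower() != "csv":
            return
        path = local_path_for(resource)  # hypothetical helper
        with open(path) as f:
            header = set(next(csv.reader(f)))
        if EXPECTED_COLUMNS <= header:
            # Columns match the spec: emit the Budget Data Package metadata.
            publish_budget_data_package(resource)  # hypothetical helper
```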

It might sound very technical, but it really is very simple: you add or update a CSV resource in CKAN, and it automatically checks whether the file contains budget data in order to publish it in a standardised form. In other words, CKAN can now automatically produce standardised budget resources, which makes integration with other systems a lot easier.

The second extension, called ckanext-openspending, shows how easy such an integration around standardised data is. It takes the published Budget Data Packages and automatically sends them to OpenSpending. From there, OpenSpending analyses and aggregates the data and makes it easy to use for anyone working with OpenSpending’s visualisation library.
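
Conceptually, the second extension’s job then reduces to something like the following sketch; the endpoint and payload shape are assumptions for illustration, not OpenSpending’s documented API.

```python
# A hedged sketch of handing a freshly published Budget Data Package to
# OpenSpending. Endpoint and payload are hypothetical.
import requests


def notify_openspending(datapackage_url, api_key):
    resp = requests.post(
        "https://openspending.org/api/3/load",   # hypothetical endpoint
        data={"datapackage": datapackage_url},   # hypothetical payload
        headers={"Authorization": "ApiKey " + api_key},
    )
    resp.raise_for_status()
    return resp.json()
```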

So thanks to a perhaps seemingly insignificant extension feature in CKAN 2.3, beautiful and understandable visualisations of budget spreadsheets are now only an upload to a CKAN instance away (and this can only get easier as the two extensions improve).

To learn even more, see this report about the CKAN and OpenSpending integration efforts.

March 11 2015

20:18

If ‘Change’ had a favourite number…it would be 2.3

There’s something about the number 2.3. It just rolls off the tongue with such an easy rectitude. Western families reportedly average 2.3 children; there were 2.3 million Americans out of work when Barack Obama took office; Starbucks goes through 2.3 million paper cups a year. But the 2.3 that resonates with me most is 2.3 billion. That was the world population in the late 1940s, and growing. WWII was over and we were finally able to stand up, dust off the despair of war and Depression, bask in a renewed confidence in the future, and make a lot of babies. We were on the brink of something, and what those babies didn’t know yet was that they would grow up to usher in a wave of unprecedented social, economic and technological change.

We are on the brink again. Open data is gaining momentum faster than the Baby Boomers are growing old, and it has the potential to steer that wave of change in all manner of directions. We’re ready for the next 2.3. Enter CKAN 2.3.

Here are some of the most exciting updates:

  • Completely refactored resource data visualizations, allowing multiple persistent views of the same data and an interface to manage and configure them. Check the updated documentation to learn more, and the “Changes and deprecations” section for migration details: http://docs.ckan.org/en/ckan-2.3/maintaining/data-viewer.html

  • Responsive design for the default theme, allowing nicer rendering across different devices

  • Improved DataStore filtering and full-text search capabilities (see the example after this list)

  • Added new extension points to modify the DataStore behaviour

  • Simplified two-step dataset creation process

  • Ability for users to regenerate their own API keys

  • Changes to the authentication mechanism to allow more secure set-ups. See the “Changes and deprecations” section for more details and “Troubleshooting” for migration instructions.

  • Better support for custom dataset types

  • Updated documentation theme, now clearer and responsive
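
For a taste of the improved DataStore capabilities mentioned above, a full-text search combined with an exact-match filter can be issued through the standard action API roughly like this (the instance URL, resource id and column values are placeholders):

```python
# Querying a DataStore-enabled resource via CKAN's datastore_search action.
import requests

resp = requests.post(
    "https://demo.ckan.org/api/3/action/datastore_search",
    json={
        "resource_id": "my-resource-id",   # placeholder
        "q": "school",                     # full-text search term
        "filters": {"year": "2014"},       # exact-match column filter
        "limit": 10,
    },
)
resp.raise_for_status()
for record in resp.json()["result"]["records"]:
    print(record)
```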

If you are upgrading from a previous version, make sure to check the “Changes and deprecations” section in the CHANGELOG, especially regarding the authorization configuration and data visualizations.

To install the new version, follow the relevant instructions from the documentation depending on whether you are using a package or source install: http://docs.ckan.org/en/ckan-2.3/maintaining/installing/index.html

If you are upgrading an existing CKAN instance, follow the upgrade instructions: http://docs.ckan.org/en/ckan-2.3/maintaining/upgrading/index.html

We have also made patch releases available for the 2.0.x, 2.1.x and 2.2.x versions. It is important to apply these, as they contain significant security and stability fixes. Patch releases are fully backwards compatible and easy to install: http://docs.ckan.org/en/latest/maintaining/upgrading/upgrade-package-to-patch-release.html

Charting the CKAN boom.

The following graph charts population from 1800 to 2100, but we’re interested in the period from the mid-1940s, when there was a marked boost in population growth.


World population estimates from 1800 to 2100. Sourced from Wikipedia: http://en.wikipedia.org/wiki/World_population. The growth from 2.3 billion in the 1940s is the boom!

With the recent release of CKAN 2.3 we’re expecting a similar boost in community contributions. To add your voice to the community and boost the profile of the CKAN project, please share a picture on Twitter and include the hashtag #WeAreCKAN.


March 02 2015

04:58

The CKAN Association: Membership has its benefits.

The CKAN Association, established in 2014, is set to grow rapidly in 2015, with a number of initiatives now being planned to attract free-tier Supporter members as well as paid members in the Gold, Silver and Bronze tiers.

The newly established Community and Communication Team (C&C Team) is recruiting members now via their Google Group at: https://groups.google.com/forum/?hl=en-GB#!forum/ckan-association-community-and-communication-team-group

The team needs your help with website updates, creative content development and community engagement. As a new team within the CKAN project, they are looking for self-motivated people to join an initial core team that will set the strategic communication objectives for the project and help to realise the incredible potential of CKAN.

If you can contribute as little as one or two hours per week, you’ll earn yourself a CKAN Association Supporter badge. But that is just the start: by joining the C&C Team you’ll be in the middle of things and help to grow a worldwide community of awesomeness.

CKAN Association Badges

The following CKAN Association badges are now available. If you are already a member of the Tech Team, you can request the Supporter Member badge via the C&C Team Google Group.

Badge files and a usage policy will be available on CKAN.org soon (this is one of the to-do items the C&C Team is recruiting help for!).

The current list of CKAN Association members can be found here: http://ckan.org/about/members/


February 27 2015

12:57

AKSW Colloquium: Tommaso Soru and Martin Brümmer on Monday, March 2 at 3.00 p.m.

On Monday, 2nd of March 2015, Tommaso Soru will present ROCKER, a refinement operator approach for key discovery. Martin Brümmer will then present NIF annotation and provenance – A comparison of approaches.

Tommaso Soru – ROCKER – Abstract

As within the typical entity-relationship model, unique and composite keys are of central importance when their concept is applied to the Linked Data paradigm. They can help in manifold areas, such as entity search, question answering, data integration and link discovery. However, the current state of the art lacks approaches that are able to scale while relying on a correct definition of a key. We thus present a refinement-operator-based approach dubbed ROCKER, which has been shown to scale to big datasets with respect to run time and memory consumption. ROCKER will be officially introduced at the 24th International Conference on World Wide Web.

Tommaso Soru, Edgard Marx, and Axel-Cyrille Ngonga Ngomo, “ROCKER – A Refinement Operator for Key Discovery”. [PDF]
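
To make the notion of a key concrete: a set of properties is a key for a class if no two distinct instances of that class share values on all of those properties. The following sketch illustrates that definition (it is not ROCKER’s implementation); SPARQLWrapper and the DBpedia endpoint are real, while the choice of class and properties is an arbitrary example.

```python
# Checking a candidate key against DBpedia: the properties form a key
# iff no two distinct instances agree on all of them.
from SPARQLWrapper import SPARQLWrapper, JSON

ASK_KEY_VIOLATION = """
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
ASK {
  ?a a dbo:Person ; dbo:birthDate ?date ; foaf:name ?name .
  ?b a dbo:Person ; dbo:birthDate ?date ; foaf:name ?name .
  FILTER (?a != ?b)
}
"""

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(ASK_KEY_VIOLATION)
sparql.setReturnFormat(JSON)
violated = sparql.query().convert()["boolean"]
print("not a key" if violated else "candidate key holds")
```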

Martin Brümmer – Abstract – NIF annotation and provenance – A comparison of approaches

The increasing use of the NLP Interchange Format (NIF) reveals its shortcomings on a number of levels. One of these is tracking the metadata of annotations represented in NIF – which NLP tool added which annotation, with what confidence, at which point in time, etc.

A number of solutions to this task of annotating annotations expressed as RDF statements have been proposed over the years. The talk will weigh these solutions – namely annotation resources, reification, Open Annotation, quads and singleton properties – with regard to their granularity, ease of implementation and query complexity.
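
As a small, hand-made illustration of two of the compared designs – plain RDF reification and quads/named graphs – the following rdflib sketch attaches a confidence score and a tool name to a single NIF-style annotation. All URIs except the ITS-RDF property are invented for the example.

```python
# Two ways to annotate an annotation: reification vs. named graphs (quads).
from rdflib import BNode, ConjunctiveGraph, Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
ITSRDF = Namespace("http://www.w3.org/2005/11/its/rdf#")

# The annotation itself: a text span is linked to a resource.
triple = (EX["doc#char=0,6"], ITSRDF.taIdentRef, EX.Berlin)

# (1) Reification: a statement resource describes the triple, so that
# metadata can hang off it.
g = Graph()
stmt = BNode()
g.add((stmt, RDF.type, RDF.Statement))
g.add((stmt, RDF.subject, triple[0]))
g.add((stmt, RDF.predicate, triple[1]))
g.add((stmt, RDF.object, triple[2]))
g.add((stmt, EX.confidence, Literal(0.87)))
g.add((stmt, EX.annotator, EX.SpotlightRun42))

# (2) Quads: put the annotation in its own named graph and describe the graph.
cg = ConjunctiveGraph()
ann_graph = cg.get_context(EX.annotationGraph1)
ann_graph.add(triple)
cg.add((EX.annotationGraph1, EX.confidence, Literal(0.87)))
cg.add((EX.annotationGraph1, EX.annotator, EX.SpotlightRun42))
```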

The goal of the talk is to present and compare viable alternatives for solving the problem at hand and to collect feedback on how to proceed.

February 20 2015

10:00

SEMANTiCS2015: Calls for Research & Innovation Papers, Industry Presentations and Poster/Demos are now open!

The SEMANTiCS2015 conference comes back this year, in its 11th edition, to Vienna, Austria – where it all started in 2005!

The conference takes place from 15-17 September 2015 (the main conference will be on 16-17 September, with several back-to-back workshops & events on the 15th) at the University of Economics – see all information: http://semantics.cc/.

SEMANTiCS 2015 - Banner - new

We are happy to announce the SEMANTiCS Open Calls as follows. All information on the calls can also be found on the SEMANTiCS2015 website: http://semantics.cc/open-calls

Call for Research & Innovation Papers

The Research & Innovation track at SEMANTiCS welcomes the submission of papers on novel scientific research and/or innovations relevant to the topics of the conference. Submissions must be original and must not have been submitted for publication elsewhere. Papers should follow the ACM ICPS guidelines for formatting (http://www.acm.org/sigs/publications/proceedings-templates) and must not exceed 8 pages in length for full papers and 4 pages for short papers, including references and optional appendices.

Abstract Submission Deadline: May 22, 2015
Paper Submission Deadline: May 29, 2015
Notification of Acceptance: July 10, 2015
Camera-Ready Paper: July 24, 2015
Details: http://bit.ly/semantics15-research

Call for Industry & Use Case Presentations

To address the needs and interests of industry, SEMANTiCS presents enterprise solutions that deal with semantic processing of data and/or information in areas like Linked Data, Data Publishing, Semantic Search, Recommendation Services, Sentiment Detection, Search Engine Add-Ons, Thesaurus and/or Ontology Management, Text Mining, Data Mining and related fields. All submissions should have a strong focus on real-world applications beyond prototypical status and demonstrate the power of semantic systems!

Submission Deadline: July 1, 2015
Notification of Acceptance: July 20, 2015
Presentation Ready: August 15, 2015
Details: http://bit.ly/semantics15-industry

Call for Posters and Demos

The Posters & Demonstrations Track invites innovative work in progress, late-breaking research and innovation results, and smaller contributions (including pieces of code) in all fields related to the broadly understood Semantic Web. The informal setting of the Posters & Demonstrations Track encourages participants to present innovations to business users and find new partners or clients. In addition to the business stream, SEMANTiCS 2015 welcomes developer-oriented posters and demos to the new technical stream.

Submission Deadline: June 17, 2015
Notification of Acceptance: July 10, 2015
Camera-Ready Paper: August 01, 2015
Details: http://bit.ly/semantics15-poster

We are looking forward to receiving your submissions for SEMANTiCS2015 and to seeing you in Vienna in autumn!

February 19 2015

21:53

AKSW Colloquium: Edgard Marx and Tommaso Soru on Monday, February 23, 3.00 p.m.

On Monday, 23rd of February 2015, Edgard Marx will introduce Smart, a search engine designed over the Semantic Search paradigm; subsequently, Tommaso Soru will present ROCKER, a refinement operator approach for key discovery.

EDIT: Tommaso Soru’s presentation was moved to March 2nd.

Abstract – Smart

Since the conception of the Web, search engines have played a key role in making content available. However, retrieving the desired information remains significantly challenging. Semantic Search systems are a natural evolution of traditional search engines: they promise more accurate interpretation by understanding the contextual meaning of the user query. In this talk, we will introduce our audience to Smart, a search engine designed around the Semantic Search paradigm. Smart incorporates two of our recently designed approaches to the problem of Information Retrieval, as well as a novel interface paradigm. Moreover, we will present some former, as well as more recent, state-of-the-art approaches used by industry – for instance by Yahoo!, Google and Facebook.

Abstract – ROCKER

As within the typical entity-relationship model, unique and composite keys are of central importance when their concept is applied to the Linked Data paradigm. They can help in manifold areas, such as entity search, question answering, data integration and link discovery. However, the current state of the art lacks approaches that are able to scale while relying on a correct definition of a key. We thus present a refinement-operator-based approach dubbed ROCKER, which has been shown to scale to big datasets with respect to run time and memory consumption. ROCKER will be officially introduced at the 24th International Conference on World Wide Web.

Tommaso Soru, Edgard Marx, and Axel-Cyrille Ngonga Ngomo, “ROCKER – A Refinement Operator for Key Discovery”. [PDF]


February 17 2015

14:38

Call for Feedback on LIDER Roadmap

The LIDER project is gathering feedback on a roadmap for the use of Linguistic Linked Data for content analytics. We invite you to give feedback in the following ways:

Excerpt from the roadmap

Full document: available here
Summary slides: available here

Content is growing at an impressive, exponential rate. Exabytes of new data are created every single day. In fact, data has recently been referred to as the “oil” of the new economy, where the new economy is understood as “a new way of organizing and managing economic activity based on the new opportunities that the Internet provided for businesses”.

Content analytics, i.e. the ability to process and generate insights from existing content, plays and will continue to play a crucial role for enterprises and organizations that seek to generate value from data, e.g. in order to inform decision and policy making.

As corroborated by many analysts, substantial investments in technology, partnerships and research are required to reach an ecosystem of players and technological solutions that provides the infrastructure, expertise and human resources organizations need to deploy content analytics solutions effectively and at large scale – whether to generate relevant insights that support policy and decision making, or to define completely new business models in a data-driven economy.

Assuming that such investments need to be and will be made, this roadmap explores the role that linked data and semantic technologies can and will play in the field of content analytics. It generates a set of recommendations for organizations, funders and researchers on which technologies to invest in, as a basis for prioritizing R&D spending and for optimizing their mid- and long-term strategies and roadmaps.

Conference Call on 19th of February 3 p.m. CET

Connection details: https://www.w3.org/community/ld4lt/wiki/Main_Page#LD4LT_calls
Summary slides: available here

Agenda

  1. Introduction to the LIDER Roadmap (Philipp Cimiano, 10 minutes)
  2. Discussion of Global Customer Engagement Use Cases (All, 10 minutes)
  3. Discussion of Public Sector and Civil Society Use Cases (All, 10 minutes)
  4. Discussion of Linked Data Life Cycle and Linguistic Linked Data Value Chain (All, 10 minutes)
  5. General Discussion on further use cases, items in the roadmap etc. (20 minutes)

In addition, the call will briefly discuss progress on the META-SHARE linked data metadata model.

The call is open to the public; no LD4LT group participation is required. Dial-in information is available. Please spread this information widely. No knowledge about linguistic linked data is required. We are especially interested in feedback from potential users of linguistic linked data.

About the LIDER Project

Website: http://lider-project.eu

The project’s mission is to provide the basis for the creation of a Linguistic Linked Data cloud that can support content analytics tasks on unstructured multilingual cross-media content. By achieving this goal, LIDER will impact the ease and efficiency with which Linguistic Linked Data can be exploited in content analytics processes.

We aim to provide the basis for a new Linked Open Data (LOD) ecosystem of free, interlinked, and semantically interoperable language resources (corpora, dictionaries, lexical and syntactic metadata, etc.) and media resources (image, video, etc. metadata) that will allow for the free and open exploitation of such resources in multilingual, cross-media content analytics across the EU and beyond, with specific use cases in industries related to social media, financial services, localization, and other multimedia content providers and consumers.

Take part in a personal interview to include your voice in the roadmap

Contact: http://lider-project.eu/?q=content/contact-us

The EU project LIDER has been tasked by the European Commission to put together a roadmap for future R&D funding in multilingual industries such as content and knowledge localization, multilingual terminology and taxonomy management, cross-border business intelligence, etc. As a leading supplier of solutions in one or more of these industries, your input is needed for this roadmap. We would like to conduct a short interview with you to establish your views on current and developing R&D efforts in multilingual and semantic technologies that will likely play an increasing role in these industries, such as Linked Data and related standards for web-based, multilingual data processing. The interview will cover five questions and will take no more than 30 minutes. Please let us know a suitable time and date.

February 16 2015

12:49

Data to Value & Semantic Web Company agree partnership to bring cutting edge Semantic Management to Financial Services clients

As part of the partnership, Data to Value will offer solution services and training in PoolParty product offerings, including ontology development and data modeling services.

The partnership aims to change the way organisations, particularly within Financial Services, manage the semantics embedded in their data landscapes. This will offer several core benefits to existing and prospective clients, including locating, contextualising and understanding the meaning and content of information faster and at a considerably lower cost. The partnership will achieve this by combining the latest Information Management and Semantic techniques, including:

  • Text Mining, Tagging, Entity Definition & Extraction.
  • Business Glossary, Data Dictionary & Data Governance techniques.
  • Taxonomy, Data Model and Ontology development.
  • Linked Data & Semantic Web analyses.
  • Data Profiling, Mining & Discovery.


A particular focus of the partnership will be to bring new approaches to the Financial Services sector where both parties believe significant opportunities exist for the adoption of semantic technologies and techniques.

This includes improving regulatory compliance in areas such as BCBS, enabling new investment research and client reporting techniques as well as general efficiency drivers such as faster integration of mergers and acquisitions.

Nigel Higgs, Managing Director of Data to Value, notes: “This is an exciting collaboration between two firms which are pushing the boundaries in the way data, information and semantics are managed by business stakeholders. We spend a great deal of time helping organisations at a grass-roots level pragmatically adopt the latest Information Management techniques. We see this partnership as an excellent way for us to help organisations take realistic steps towards adopting the latest semantic techniques.”

Andreas Blumauer, CEO of Semantic Web Company, adds: “The consortium of our two companies offers a unique bundle, which consists of a world-class semantic platform and a team of experts who know exactly how semantics can help to increase the efficiency and reliability of knowledge-intensive business processes in the financial industry.”

12:45

AKSW Colloquium: Konrad Höffner and Michael Röder on Monday, February 16, 3.00 p.m.

CubeQA—Question Answering on Statistical Linked Data by Konrad Höffner

Abstract

Question answering systems provide intuitive access to data by translating natural language queries into SPARQL, which is the native query language of RDF knowledge bases. Statistical data, however, is structurally very different from other data and cannot be queried using existing approaches. Building upon a question corpus established in previous work, we created a benchmark for evaluating questions on statistical Linked Data in order to evaluate statistical question answering algorithms and to stimulate further research. Furthermore, we designed a question answering algorithm for statistical data, which covers a wide range of question types. To our knowledge, this is the first question answering approach for statistical RDF data and could open up a new research area.
See also the paper (preprint, under review) and the slides.
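
To illustrate what question answering on statistical Linked Data means in practice, here is a hand-written example (not CubeQA’s output) of the kind of SPARQL a question might be translated into over an RDF Data Cube; the dataset URI and component properties are invented for the example.

```python
# What a question over statistical Linked Data might compile to: an RDF
# Data Cube observation lookup. URIs below are illustrative only.
QUESTION = "What was the research budget of Leipzig in 2013?"

SPARQL = """
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX ex: <http://example.org/budget#>

SELECT ?amount WHERE {
  ?obs a qb:Observation ;
       qb:dataSet   ex:cityBudgets ;
       ex:refArea   ex:Leipzig ;
       ex:refPeriod "2013" ;
       ex:sector    ex:Research ;
       ex:amount    ?amount .
}
"""

print(QUESTION)
print(SPARQL)
```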

News from the WSDM 2015 by Michael Röder

Abstract

The WSDM conference is one of the major conferences on Web Search and Data Mining. Michael Röder attended this year’s WSDM conference in Shanghai and will present a short overview of the conference topics. After that, he will take a closer look at FEL – an entity linking approach for search queries presented at the conference.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

11:50

Kick-off of the FREME project

Hi all!

A new InfAI project, FREME, has kicked off in Berlin. FREME – Open Framework of E-Services for Multilingual and Semantic Enrichment of Digital Content – is an H2020-funded project with the objective of building an open, innovative, commercial-grade framework of e-services for multilingual and semantic enrichment of digital content.

InfAI will play an important role in FREME by driving two of the six central FREME services, e-Link and e-Entity. NIF will be used as a mediator between language services and data sources, serving as the foundation for e-Link, while DBpedia Spotlight will serve as a prototype for e-Entity services, linking named entities in natural language texts to Linked Open Data sets like DBpedia.
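
Since e-Entity builds on DBpedia Spotlight, a quick sketch of calling a public Spotlight endpoint directly gives a feel for the underlying service (the endpoint URL and parameters reflect the public demo service and may change):

```python
# Annotating a sentence with DBpedia Spotlight's REST API.
import requests

resp = requests.get(
    "http://api.dbpedia-spotlight.org/en/annotate",
    params={"text": "Berlin is the capital of Germany.", "confidence": 0.5},
    headers={"Accept": "application/json"},
)
resp.raise_for_status()
for res in resp.json().get("Resources", []):
    print(res["@surfaceForm"], "->", res["@URI"])
```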

InfAI will also help to identify and publish new Linked Data sets that can contribute to data value chains. Our partners in this open content enrichment effort will be DFKI, Tilde, Iminds, Agro-Know, Wripl, VistaTEC and ISBM.

Stay tuned for more info! In the meantime, join the conversation on Twitter with #FREMEH2020.

- Amrapali Zaveri on behalf of the NLP2RDF group

February 13 2015

09:38

DL-Learner 1.0 (Supervised Structured Machine Learning Framework) Released

Dear all,

we are happy to announce DL-Learner 1.0.

DL-Learner is a framework containing algorithms for supervised machine learning in RDF and OWL. DL-Learner can use various RDF and OWL serialization formats as well as SPARQL endpoints as input, can connect to most popular OWL reasoners and is easily and flexibly configurable. It extends concepts of Inductive Logic Programming and Relational Learning to the Semantic Web in order to allow powerful data analysis.

Website: http://dl-learner.org
GitHub page: https://github.com/AKSW/DL-Learner
Download: https://github.com/AKSW/DL-Learner/releases
ChangeLog: http://dl-learner.org/development/changelog/

DL-Learner is used for data analysis in other tools such as ORE and RDFUnit. Technically, it uses refinement-operator-based, pattern-based and evolutionary techniques for learning on structured data. For a practical example, see http://dl-learner.org/community/carcinogenesis/. It also offers a plugin for Protégé, which can suggest axioms to add. DL-Learner is part of the Linked Data Stack – a repository for Linked Data management tools.

We want to thank everyone who helped to create this release, in particular (alphabetically) An Tran, Chris Shellenbarger, Christoph Haase, Daniel Fleischhacker, Didier Cherix, Johanna Völker, Konrad Höffner, Robert Höhndorf, Sebastian Hellmann and Simon Bin. We also acknowledge support by the recently started SAKE project, in which DL-Learner will be applied to event analysis in manufacturing use cases, as well as the GeoKnow and Big Data Europe projects where it is part of the respective platforms.

Kind regards,

Lorenz Bühmann, Jens Lehmann and Patrick Westphal

09:10

Writing a Survey – Steps, Advantages, Limitations and Examples

What is a Survey?

A survey, or systematic literature review, is a scholarly text that summarises the current knowledge on a particular topic, including substantive findings as well as theoretical and methodological contributions. Literature reviews use secondary sources and do not report new or original experimental work [1].

A systematic review is a literature review focused on a research question, which tries to identify, appraise, select and synthesise all high-quality research evidence and arguments relevant to that question. Moreover, a literature review should be comprehensive, exhaustive and repeatable; that is, readers should be able to replicate or verify the review.

Steps to perform a survey

  • Select two independent reviewers

  • Look for related/existing surveys

    • If one exists, see how long ago it was done. If it was done 10 years ago, you can go ahead and update it.

  • Formulate research questions

  • Devise eligibility criteria

  • Define search strategy – keywords, journals, conferences, workshops to search in

  • Retrieve further potential articles using the search strategy and by directly contacting top researchers in the field

  • Compare chosen articles among reviewers and decide a core set of papers to be included in the survey

  • Perform qualitative and quantitative analyses on the selected set of papers

  • Report on the results

Advantages of writing a survey

There are several benefits/advantages of conducting a survey, such as:

  • A survey is the best way to get an idea of the state-of-the-art technologies, algorithms, tools etc. in a particular field

  • One can get a clear birds-eye overview of the current state of that field

  • It can serve as a great starting point for a student or any researcher thinking of venturing into that particular field/area of research

  • One can easily acquire up-to-date information on a subject by referring to a review

  • It gives researchers the opportunity to formalize different concepts of a particular field

  • It allows one to identify challenges and gaps that are unanswered and crucial for that subject

Limitations of a survey

However, there are a few limitations that must be considered before undertaking a survey such as:

  • Surveys can tend to be biased; it is therefore necessary to have two researchers who perform the systematic search for the articles independently

  • It is quite challenging to unify concepts, especially when different ideas referring to the same concepts have developed over several years

  • Indeed, conducting a survey and getting the article published is a long process

Surveys conducted by members of the AKSW group

In our group, three students conducted comprehensive literature reviews on three different topics:

  • Linked Data Quality: The survey covers 30 core papers, which focus on providing quality assessment methodologies for Linked Data specifically. A total of 18 data quality dimensions along with their definitions and 69 metrics are provided. Additionally, the survey contributes a comparison of 12 tools, which perform quality assessment of Linked Data [2].

  • Ubiquitous Semantic Applications: The survey presents a thorough analysis of 48 primary studies out of 172 initially retrieved papers. The results consist of a comprehensive set of quality attributes for Ubiquitous Semantic Applications together with corresponding application features suggested for their realization. The quality attributes include aspects such as mobility, usability, heterogeneity, collaboration, customizability and evolvability. The proposed quality attributes facilitate the evaluation of existing approaches and the development of novel, more effective and intuitive Ubiquitous Semantic Applications [3].

  • User interfaces for semantic authoring of textual content: The survey covers a thorough analysis of 31 primary studies out of 175 initially retrieved papers. The results consist of a comprehensive set of quality attributes for SCA systems together with corresponding user interface features suggested for their realization. The quality attributes include aspects such as usability, automation, generalizability, collaboration, customizability and evolvability. The proposed quality attributes and UI features facilitate the evaluation of existing approaches and the development of novel, more effective and intuitive semantic authoring interfaces [4].

Also, here is a presentation on “Systematic Literature Reviews”: http://slidewiki.org/deck/57_systematic-literature-review.

References

[1] Lisa A. Baglione (2012) Writing a Research Paper in Political Science. Thousand Oaks: CQ Press.

[2] Amrapali Zaveri, Anisa Rula, Andrea Maurino, Ricardo Pietrobon, Jens Lehmann and Sören Auer (2015), ‘Quality Assessment for Linked Data: A Survey’, Semantic Web Journal. http://www.semantic-web-journal.net/content/quality-assessment-linked-data-survey

[3] Timofey Ermilov, Ali Khalili, and Sören Auer (2014). ‘Ubiquitous Semantic Applications: A Systematic Literature Review’. Int. J. Semant. Web Inf. Syst. 10, 1 (January 2014), 66-99. DOI=10.4018/ijswis.2014010103 http://dx.doi.org/10.4018/ijswis.2014010103

[4] Ali Khalili and Sören Auer (2013). ‘User interfaces for semantic authoring of textual content: A systematic literature review’, Web Semantics: Science, Services and Agents on the World Wide Web, Volume 22, October 2013, Pages 1-18 http://www.sciencedirect.com/science/article/pii/S1570826813000498


February 03 2015

10:39

Kick-Off for the BMWi project SAKE

Hi all!

One of AKSW’s Big Data projects, SAKE – Semantische Analyse Komplexer Ereignisse (Semantic Analysis of Complex Events), has kicked off in Karlsruhe. SAKE is one of the winners of the Smart Data Challenge, is funded by the German BMWi (Bundesministerium für Wirtschaft und Energie) and has a duration of 3 years. Within this project, AKSW will develop powerful methods for the analysis of industrial-scale Big Linked Data in real time. To this end, the team will extend existing frameworks like LIMES, QUETSAL and FOX. Together with USU AG, Heidelberger Druckmaschinen, Fraunhofer IAIS and AviComp Controls, novel methods for tackling Business Intelligence challenges will be devised.

More info to come soon!

Stay tuned!

Axel on behalf of the SAKE team

February 02 2015

11:00

AKSW Colloquium: Ricardo Usbeck and Ivan Ermilov on Monday, February 2, 3.00 p.m.

GERBIL – General Entity Annotation Benchmark Framework by Ricardo Usbeck

Abstract

The need to bridge between the unstructured data on the document Web and the structured data on the Data Web has led to the development of a considerable number of annotation tools. Those tools are hard to compare since published results are calculated on diverse datasets and measured in different units.

We present GERBIL, a general entity annotation system based on the BAT-Framework. GERBIL offers an easy-to-use web-based platform for the agile comparison of annotators using multiple datasets and uniform measuring approaches. To add a tool to GERBIL, all the end user has to do is provide a URL to a REST interface for the tool which abides by a given specification. The integration and benchmarking of the tool against user-specified datasets is then carried out automatically by the GERBIL platform. Currently, our platform provides results for 9 annotators and 11 datasets, with more coming. Internally, GERBIL is based on the NLP Interchange Format (NIF) and provides Java classes for connecting the APIs of datasets and annotators to NIF. For the paper see here.
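
The kind of NIF-speaking wrapper GERBIL expects can be sketched as a tiny web service that accepts a NIF document via POST and returns the annotated NIF. The sketch below simplifies the real specification, and annotate_nif is a hypothetical stand-in for an actual tool.

```python
# A hedged sketch of a NIF wrapper endpoint for an entity annotator.
from flask import Flask, Response, request

app = Flask(__name__)


@app.route("/annotate", methods=["POST"])
def annotate():
    nif_document = request.data.decode("utf-8")   # NIF document in Turtle
    annotated = annotate_nif(nif_document)        # run the wrapped tool
    return Response(annotated, mimetype="text/turtle")


def annotate_nif(nif_turtle):
    # Stand-in: a real wrapper would parse the NIF, run entity linking,
    # and add itsrdf:taIdentRef triples for each entity it finds.
    return nif_turtle


if __name__ == "__main__":
    app.run(port=8080)
```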

Towards Efficient and Effective Semantic Table Interpretation by Ziqi Zhang presented by Ivan Ermilov

Abstract

Ivan will present a paper that describes TableMiner by Ziqi Zhang, the first semantic Table Interpretation method that adopts an incremental, mutually recursive and bootstrapping learning approach seeded by automatically selected ‘partial’ data from a table. TableMiner labels columns containing named entity mentions with semantic concepts that best describe the data in those columns, and disambiguates entity content cells in these columns. TableMiner is able to use various types of contextual information outside tables for Table Interpretation, including semantic markup (e.g., RDFa/microdata annotations) that, to the best of our knowledge, has never been used in Natural Language Processing tasks. Evaluation on two datasets shows that compared to two baselines, TableMiner consistently obtains the best performance. In the classification task, it achieves significant improvements of between 0.08 and 0.38 F1 depending on the baseline method; in the disambiguation task, it outperforms both baselines by between 0.19 and 0.37 in Precision on one dataset, and by between 0.02 and 0.03 F1 on the other dataset. Observation also shows that the bootstrapping learning approach adopted by TableMiner can potentially deliver computational savings of between 24 and 60% compared to classic methods that ‘exhaustively’ process the entire table content to build features for interpretation.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

January 20 2015

15:09

Two AKSW Papers at #WWW2015 in Florence, Italy

Hello Community!
We are very pleased to announce that two of our papers were accepted for presentation at WWW 2015. The papers cover a novel approach for key discovery when linking ontologies and a benchmark framework for entity annotation systems. In more detail, we will present ROCKER, a refinement operator approach for key discovery, and GERBIL, a general entity annotation benchmark framework.
Visit us from the 18th to the 22nd of May in Florence, Italy and enjoy the talks. More information on these publications at http://aksw.org/Publications.
Cheers,
Ricardo on behalf of AKSW

January 10 2015

18:35

Introducing PoolParty 5

Five more reasons to lean on a world-class semantic platform

Semantic Web Company has just released a new version of the PoolParty Semantic Suite. Apart from minor improvements and fixes, users of PoolParty 5 benefit from the following new features and major improvements: highly precise entity extraction, deep integration with SharePoint & Drupal, a fully integrated web crawler, a refined look & feel, and full-blown ontology management & semantic reasoning.


Highly precise entity extraction: In addition to the thesaurus-based entity extraction service and PoolParty’s language-independent free term & phrase extractor, which was introduced a while ago, PoolParty now offers a configurable, rule-based disambiguation service. The service makes use of context information about ambiguous entities, which is derived from the knowledge model (thesaurus).

Deep integration with SharePoint, Drupal & Confluence: PoolParty’s APIs offer great opportunities to exploit semantic technologies in various content platforms (see: Integrate your CMS with PoolParty). Since PoolParty’s previous major release, several modules have been (further) developed to make use of (semi-)automatic tagging, content classification, semantic search, and content recommendation to extend the functionality of popular content and collaboration platforms like SharePoint, Confluence or Drupal. These products are ready to be used: Make use of semantics now!

Fully integrated web crawler: A basic principle of the PoolParty Semantic Suite is support for learning loops when developing semantic knowledge graphs, which should reflect the underlying information and content base as exactly as possible. Taxonomists should receive as much tool support as possible when creating and maintaining controlled vocabularies and linked data graphs. PoolParty 5 therefore offers several ways to analyse various content streams, e.g. web crawling (with configurable depth), content harvesting from Wikipedia/DBpedia, or RSS feeds. The resulting candidate terms can easily be introduced into vocabularies to extend the knowledge model step by step.

Refined look & feel: Taxonomy management and ontology engineering can be a tedious process. Since version 1, PoolParty has followed the philosophy of making things as simple as possible, but not simpler. User experience and interface design are a central component of this approach. PoolParty 5 got an extensive redesign, and we are sure that you will like it!

Full-blown ontology management & semantic reasoning: PoolParty-based knowledge modelling is not restricted to SKOS, but it most often starts with it. To extend the ontology behind SKOS, the PoolParty editor provides easy-to-learn dialogues. You can import widely used ontologies from the included library to reuse them, e.g. schema.org, FOAF, or vCard. You can also extend and remix these ontologies in order to create new custom schemas. By these means, taxonomists can enrich any node of a vocabulary with specific relations and attributes, extending it into a full-blown semantic knowledge graph. PoolParty’s built-in rule engine will make sure that your knowledge graph remains consistent at all times.
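
In plain RDF terms, “extending the ontology behind SKOS” amounts to attaching custom relations and attributes to SKOS concepts. Here is a small rdflib sketch with invented URIs (PoolParty does this through its dialogues, not through code):

```python
# A SKOS concept enriched with custom, non-SKOS relations and attributes.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.org/schema#")
g = Graph()

g.add((EX.Bordeaux, RDF.type, SKOS.Concept))
g.add((EX.Bordeaux, SKOS.prefLabel, Literal("Bordeaux", lang="en")))
g.add((EX.Bordeaux, SKOS.broader, EX.WineRegion))
# Custom extensions on the same node, beyond plain SKOS:
g.add((EX.Bordeaux, EX.locatedIn, EX.France))
g.add((EX.Bordeaux, EX.averageAnnualProduction, Literal(700000000)))

print(g.serialize(format="turtle"))
```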

Sounds interesting?

Register for the PoolParty 5 Webinar!

In this webinar you will get a perfect overview of the new features of PoolParty version 5. Additionally, we will give live demos of linked data applications based on the PoolParty Semantic Platform. For example, we will demonstrate a brand-new semantic search application which is fully based on an RDF graph store and the standards-based query language SPARQL.

Wed, Feb 25, 2015 10:30 AM – 11:30 AM EST

Register now for free

January 07 2015

20:12

DBpedia Usage Report

We've just published the latest DBpedia Usage Report, covering v3.3 (released July 2009) to v3.9 (released September 2013); v3.10 (sometimes called "DBpedia 2014", released September 2014) will be included in the next report.

We think you'll find some interesting details in the statistics. There are also some important notes about Virtuoso configuration options and other sneaky technical issues that can surprise you (as they did us!) when exposing an ad-hoc query server to the world.
