The World Wide Web is the product of a vision developed long before the current popularity of the web. The web is a place where people can deposit content and also make links between the content on one web page and the content on other pages, and between the content of one site and the content of another. In its present state the World Wide Web is a very disorganised collection of data, information and knowledge where retrieval is largely a manual process. Phenomenal growth in content on the World Wide Web is partly responsible for the chaotic situation that now prevails.

A new vision of the World Wide Web, referred to as the Semantic Web, has recently been formulated. It is essentially seen as an information resource with more structure, from which information and knowledge can easily be extracted even though the content is globally distributed.

The construction of the Semantic Web requires developments that will enable web content to be better organised and classified, as well as the application of knowledge-based technologies that will help to automate the collection of information and the extraction of knowledge from web content. The intention is to transform the web into an efficient information and knowledge source and to enable the development of value-added services based on the web.

Main Issues

The computer was invented as a device for computation, but it has now become a truly universal machine, for it also provides a means of entertainment as well as an entry point to a worldwide network of information exchange. A technology is now needed that supports access to unstructured and heterogeneous distributed information and knowledge sources. This technology is called the Semantic Web.

A major problem is extracting useful knowledge from information found on the web. Everyone is facing a deluge of data and information, and the volume is overwhelming. However, a number of important issues need to be addressed before useful knowledge can be extracted from it. The first of these is the development of shared vocabularies (also known as ontologies). These will provide the basis of a common understanding of the meaning of words used in different applications. Without such agreed vocabularies the present chaos will continue. Trust is also important: users need to know that data, information and knowledge derive from reliable sources. This is a major difficulty that will need to be resolved if the vision of value-added services based on the Semantic Web is to become a reality. Content also needs to be annotated, preferably by automated means, if knowledge-based services are to be delivered based on web content.
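The link between a shared vocabulary and automated annotation can be illustrated with a minimal Python sketch. It assumes a tiny, hypothetical vocabulary (the `ex:` concept names and the word lists are invented for illustration) in which different applications use different words for the same concept, and it annotates text by simple word matching. Real annotation tools are considerably more sophisticated; this only shows the principle.

```python
import re

# Hypothetical shared vocabulary: each agreed concept is listed with the
# different words that individual applications might use for it.
VOCABULARY = {
    "ex:Automobile": {"car", "automobile", "motor vehicle"},
    "ex:Physician": {"doctor", "physician"},
}


def annotate(text, vocabulary):
    """Return sorted (start, end, concept) tuples for every vocabulary term
    found in the text, matching whole words case-insensitively."""
    annotations = []
    for concept, terms in vocabulary.items():
        for term in terms:
            pattern = rf"\b{re.escape(term)}\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                annotations.append((match.start(), match.end(), concept))
    return sorted(annotations)


if __name__ == "__main__":
    for start, end, concept in annotate("The doctor bought a car.", VOCABULARY):
        print(start, end, concept)
```

Because both "doctor" and "physician" resolve to the same concept, two applications that use different words can still exchange annotated content without ambiguity.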

A number of activities have been initiated by the World Wide Web Consortium (W3C) that are directed towards the development of the Semantic Web. Its Semantic Web activity is based upon evolving the current World Wide Web into something that better supports automation. The key tools for this are ontologies and the Resource Description Framework (RDF). The aim of W3C is for the semantics of ontologies to be defined by user communities. RDF provides a generic means of describing resources of any kind. It makes use of Extensible Markup Language (XML), and it also provides a means of helping to automate aspects of using the web.
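As an illustration of how the Resource Description Framework describes a resource, the following Python sketch models a description as subject, predicate and object statements and writes them out in an XML form. It is a simplified sketch that uses only the standard library: the `http://example.org/` addresses are hypothetical, the `Triple`, `objects` and `to_rdf_xml` names are invented for this example, and treating any object that starts with `http://` as a resource rather than a literal is an assumption made to keep the code short. The two namespaces are the published RDF syntax and Dublin Core element namespaces.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements
EX = "http://example.org/"  # hypothetical namespace for illustration


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# A resource (a report) described by two statements.
GRAPH = [
    Triple(EX + "report", DC_NS + "title", "The Semantic Web"),
    Triple(EX + "report", DC_NS + "creator", EX + "people/alice"),
]


def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]


def to_rdf_xml(graph):
    """Serialise the statements as XML, one Description per subject."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    nodes = {}
    for t in graph:
        node = nodes.get(t.subject)
        if node is None:
            node = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                                 {f"{{{RDF_NS}}}about": t.subject})
            nodes[t.subject] = node
        # Split the predicate address into namespace and local name.
        cut = max(t.predicate.rfind("/"), t.predicate.rfind("#")) + 1
        prop = ET.SubElement(node, f"{{{t.predicate[:cut]}}}{t.predicate[cut:]}")
        if t.obj.startswith("http://"):
            # Simplifying assumption: addresses are resources, not literals.
            prop.set(f"{{{RDF_NS}}}resource", t.obj)
        else:
            prop.text = t.obj
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(objects(GRAPH, EX + "report", DC_NS + "title"))
    print(to_rdf_xml(GRAPH))
```

Because every statement has the same three-part shape, software can query or merge descriptions from different sites without knowing in advance what kind of resource is being described.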

The Semantic Web activity in W3C centres on working groups and interest groups looking at issues such as the Resource Description Framework, the development of specifications, and achieving better integration of the Resource Description Framework with web services and Extensible Markup Language. A web ontology group is working to produce a more sophisticated, richer and more expressive language through which user communities can expose the more detailed semantics of their ontologies.

One difficulty is creating ontologies, as these need to be consensual. Reaching consensus is sometimes routine, but sometimes very hard, depending on the situation. At the moment there are no good design guidelines for developing ontologies. The key issue is capturing the rationale for representing the world in a particular way. The question of how to maintain ontologies in areas where rapid changes and developments are taking place is also a major problem, one for which there are as yet no solutions.

Maintenance of the content is also an important issue. This is often addressed as an afterthought but is in fact a central issue. There is a need to acquire knowledge with a view to future maintenance. Not enough attention has been paid to this subject.

The issue of how to deal with both general and specific knowledge needs to be addressed. One way forward is to create high-level abstract ontologies, and then to specialise these to particular cases. However, there is a view that ontologies are task dependent and that mapping between ontologies or merging ontologies is hard. There is also no single best ontology, and it is unlikely that any one ontology will be uniquely accepted.
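The idea of specialising an abstract ontology, and the difficulty of mapping between ontologies, can be sketched in a few lines of Python. The class names, the hierarchy and the mapping below are all hypothetical, and a single-parent hierarchy is assumed for brevity; real ontology languages allow much richer structure.

```python
# Hypothetical abstract ontology, specialised downwards: child -> parent.
SUBCLASS_OF = {
    "Document": "Thing",
    "Report": "Document",
    "TechnicalReport": "Report",
    "Person": "Thing",
}

# Hypothetical mapping from this ontology to a second, task-specific one.
LIBRARY_TO_PUBLISHER = {
    "Report": "Publication",
    "Person": "Author",
}


def ancestors(term, hierarchy):
    """Return the chain of increasingly general classes above a term."""
    chain = []
    seen = set()
    while term in hierarchy and term not in seen:
        seen.add(term)  # guards against accidental cycles
        term = hierarchy[term]
        chain.append(term)
    return chain


def is_a(term, candidate, hierarchy):
    """True if term is the candidate class or a specialisation of it."""
    return term == candidate or candidate in ancestors(term, hierarchy)


def translate(term, mapping, hierarchy):
    """Map a term into the other ontology, falling back to the nearest
    mapped ancestor; return None when no counterpart exists."""
    for candidate in [term] + ancestors(term, hierarchy):
        if candidate in mapping:
            return mapping[candidate]
    return None


if __name__ == "__main__":
    print(is_a("TechnicalReport", "Document", SUBCLASS_OF))
    print(translate("TechnicalReport", LIBRARY_TO_PUBLISHER, SUBCLASS_OF))
    print(translate("Document", LIBRARY_TO_PUBLISHER, SUBCLASS_OF))
```

In this sketch a technical report can only be translated as the more general "Publication", and "Document" has no counterpart at all, which hints at why mappings between task-dependent ontologies tend to lose information.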

Conclusions and Future Directions

The creation of the Semantic Web provides an opportunity for technologies to be used to serve people. Application of these technologies offers the potential to make the web easier to use and more user-friendly, and to automate the extraction of knowledge from the data and information on the web. The challenges that lie ahead include the construction of shared vocabularies; the development of automated means of annotating web content, especially legacy content from current web sites; the authentication of content (a question of quality and trust); and the maintenance of content. A major challenge is using knowledge technologies to create a more automated approach to using the web and extracting knowledge from the vast resource that is the World Wide Web. Ensuring the quality of the knowledge provided is a major issue to be resolved if knowledge-based web services are to be widely accepted.


