7. Perspectives and conclusion

To conclude this dissertation, I will propose a number of directions in which the works presented therein can be improved, continued and extended.

7.1. Convergences

Several of the works presented in this dissertation have pioneered the use of Web technologies in fields where they were not considered at the time. Modelling interaction traces with RDF allowed us to elicit the semantics of the constituents of the traces, and to share reusable trace models and transformations. As for video annotations, we experimented with HTML-based hypervideos before video became a first-class citizen of the Web. In the meantime, those use cases have gained traction, and W3C standards have emerged to support them. It is therefore necessary to re-evaluate our original proposals in the light of those standards, the evolution of technologies, and the evolution of uses.

As mentioned at the end of Section 4.1, we have already started this critical work (Cazenave-Lévêque 2016), comparing our modelled-trace meta-model in particular to PROV (Lebo, Satya Sahoo, and McGuinness 2013) and Activity Streams (Snell and Prodromou 2016). While each has its own focus, both address the problem of representing agents’ activity. More work is required to align our meta-model with those standards, especially when it comes to capturing the context of the activity. Once this is done, however, we could leverage PROV traces and Activity Streams using Trace Based Reasoning (see Section 2.4). Since both formats are increasingly adopted for representing traces in various domains, but provide no standard means for processing or tapping those traces, this will create a number of stimulating opportunities for applying and refining TBR.
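To make the comparison concrete, the sketch below (Python, standard library only) expresses one hypothetical traced event — a user creating an annotation — both as an Activity Streams 2.0 object and as a PROV-O description serialized in Turtle. The vocabularies are those of the two W3C documents cited above; all resource identifiers (the example.org IRIs, user and annotation names) are illustrative assumptions, not part of either standard.

```python
import json

# Hypothetical traced event: user "alice" creates annotation 42 at a given time.

# Activity Streams 2.0 represents it as a JSON-LD activity object:
activity_streams = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": {"type": "Person", "id": "http://example.org/users/alice"},
    "object": {"type": "Note", "id": "http://example.org/annotations/42"},
    "published": "2016-11-22T10:00:00Z",
}

# PROV-O represents the same event as an Activity linked to an Agent and
# to the Entity it generated; shown here in Turtle for readability:
prov_turtle = """\
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .

ex:event42 a prov:Activity ;
    prov:wasAssociatedWith ex:alice ;
    prov:generated ex:annotation42 ;
    prov:startedAtTime "2016-11-22T10:00:00Z"^^xsd:dateTime .
"""

print(json.dumps(activity_streams, indent=2))
print(prov_turtle)
```

Both descriptions capture who did what and when, but neither captures, out of the box, the interaction context that our modelled-trace meta-model makes explicit — which is precisely where the alignment work remains to be done.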

The same goes for video annotations: the Web Annotation Data Model (Sanderson, Ciccarese, and Young 2016), combined with Media Fragment URIs (Troncy et al. 2012), now provides a standard replacement for Cinelab annotations (Section 5.1). However, the Cinelab model goes beyond the scope of Web Annotations, proposing to describe and share reusable annotation structures (annotation models), as well as hypervideos based on those annotations (views). We plan to adapt those concepts to the Web Annotation standard. In particular, although a number of popular JavaScript toolkits have been proposed to produce rich HTML5-based hypervideos, a more declarative approach would ease the authoring and reuse of hypervideo designs. Our proposals in that direction based on Cinelab (Sadallah, Aubert, and Prié 2014; Steiner et al. 2015) ought to be adapted to the new Web Annotation model.
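As an illustration of that replacement, a basic Cinelab-style annotation — a textual note attached to a temporal fragment of a video — can be expressed by combining the two standards: a Web Annotation whose target is a Media Fragment URI (`#t=10,20` selects seconds 10 to 20). The sketch below (Python, standard library only) builds such an annotation; the IRIs and the note’s content are illustrative placeholders, not actual Cinelab or Advene identifiers.

```python
import json

# A Web Annotation (W3C Web Annotation Data Model) with a textual body,
# targeting seconds 10-20 of a video through a Media Fragment URI.
web_annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/anno/1",           # placeholder annotation IRI
    "type": "Annotation",
    "body": {
        "type": "TextualBody",
        "value": "Opening scene",                # the annotation's content
        "format": "text/plain",
    },
    # The fragment identifier #t=10,20 follows Media Fragments URI 1.0:
    "target": "http://example.org/video.mp4#t=10,20",
}

print(json.dumps(web_annotation, indent=2))
```

What this standard form does not cover is precisely what Cinelab adds on top: reusable annotation models constraining sets of such annotations, and views assembling them into hypervideos.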

7.2. Towards ambivalence-aware intelligent applications

Winograd (2006) suggests that the perceived opposition between AI and Human Computer Interaction (HCI) is actually a deeper opposition between what he calls the rationalistic approach (putting emphasis on data and knowledge processing) and the design approach (putting emphasis on interactions). He argues that a balance must be found between those two trends, both in AI and in HCI, and that both fields may not differ so much once the right balance is found.

I believe that the proposal in Chapter 6 may be helpful in finding that balance. At the end of Section 6.4, we emphasized the tight relationship between semantics and interactions, and proposed that the notion of congruence can help design interfaces and visualizations supporting multiple interpretations. This will of course require investigating more deeply the implications of the proposed framework. At a theoretical level, we must precisely define the desirable formal properties that the notion of congruence can bring to a system. This will then allow us, at a practical level, to provide guidelines for designing ambivalence-aware systems and user interfaces, and to experimentally assess their added value compared to more traditional systems.

More recently, another opposition has been dividing AI, which we could label top-down versus bottom-up. The former approach focuses on knowledge engineering, interpreting data through carefully designed models, while the latter focuses on data mining and machine learning, leaving semantics to emerge from the data. Elated by the spectacular successes of deep-learning techniques (Johnson 2016; Williams 2016), some have pronounced the definitive victory of the bottom-up approach. Wu (2013) claims that “having more data allows the ‘data to speak for itself’, instead of relying on unproven assumptions and weak correlations”. This extreme position neglects the biases introduced by how data is collected, represented, visualized and, more generally, made to “speak”. Surely, both approaches are valuable and can complement each other, as long as they leave room for ambivalence rather than aiming for an illusory unique objective meaning.

Existing approaches have been successful for big organizations and companies, which have access to massive amounts of data generated by their users, and to the corresponding computing power required to process them. Novel approaches are still required, however, to help individuals make sense of their own personal data. These data are often too sparse for pure bottom-up approaches, and too heterogeneous for pure top-down approaches. They are too complex for manual processing (even with efficient user interfaces), but also too complex for fully automated reasoning. Again, the proposal in Chapter 6 can provide the basis of a unifying framework for such hybrid approaches.

7.3. Re-decentralizing the Web

Regardless of the approach, users willing to analyze their personal data face another problem: a large part of these data is locked in the databases of the services they use, and they have only limited access to them, if any. Worse, users have very little control over, or knowledge of, who has access to their personal data.

As stressed by Hall, Hendler, and Staab (2016), the Web has become so influential in our societies that it can no longer be considered merely a technical system. “Code becomes law, but the law should not be imposed by the few without the control, or at least the knowledge, of the many.” Any technical decision must be considered together with its social and ethical implications. We must therefore favor architectures that empower users. In Chapter 4, we have shown how the principles of Linked Data can help achieve this goal. Projects such as SOLID (Mansour et al. 2016) and Hydra (Lanthaler and Gütl 2013) make it possible to create Web services that do not deprive users of control over their personal data, and that can connect with each other on the basis of users’ needs and intents, rather than on a pre-defined application-driven basis. All the future works identified in this conclusion must keep in line with this trend, and contribute to it.

7.4. Conclusion

When starting to write a habilitation dissertation, it is at first difficult to find a consistent way to present one’s work. It looks very much like a patchwork, each project involving different people, happening in a different context, and led much more by opportunities and serendipity than by a continuous attempt to hold one’s course. Is there any consistency to be found there anyway? One then has to adopt a different perspective on the work done, focusing on the only common trait in all this diversity: oneself. One can then try to describe not only what one has done, but why one did it that way, and what remains to be done.

Writing this dissertation has been a long undertaking. It was nevertheless a rewarding experience: first by helping me realize that I was indeed following a more consistent research direction than I had thought; and second by giving me an opportunity to rediscover my own works (and others’) in the light of this research direction. I hope that the reader will have found as much interest in reading these pages as I have found in writing them.

The question of meaning

‘I don’t know what you mean by “glory,”’ Alice said.

Humpty Dumpty smiled contemptuously. ‘Of course you don’t – till I tell you. I meant “there’s a nice knock-down argument for you!”’

‘But “glory” doesn’t mean “a nice knock-down argument,”’ Alice objected.

‘When I use a word,’ Humpty Dumpty said in rather a scornful tone, ‘it means just what I choose it to mean – neither more nor less.’

‘The question is,’ said Alice, ‘whether you can make words mean so many different things.’

‘The question is,’ said Humpty Dumpty, ‘which is to be master – that’s all.’

– Lewis Carroll (1871)

As I have tried to demonstrate throughout these chapters, what relates our works on interaction traces, on video annotations and on the Web of Linked Data is the ambition to build intelligent systems that leave the field of possible interpretations as open as possible. That way, the system can adapt to the user, rather than forcing the user to adapt to it. The ability for users to define their own transformations in MTMSs, or their own description schemas and views in Advene, enables them to elicit and leverage their own interpretations.

This does not mean, however, that all possible interpretations are equally valid; meaning cannot be arbitrarily decreed (despite Humpty Dumpty’s attempt, see the excerpt opposite), it has to be co-constructed, negotiated. The notion of congruence, proposed in the last chapter, aims to provide a formal framework to account for this negotiation: an interpretation is acceptable to the extent that it explains the operations that are performed (or can be performed) with the system.

Semantics is therefore anchored in interactions, which brings us back to the importance of traces and TBR, of course, but also to the importance of design. In order to be acknowledged as intelligent, a system must first be intelligible. In Section 6.4, we explored the potential impact of the proposed framework on different aspects of software design, including the intelligibility of information presentation. By describing a formal link between interpretations on the one hand, and presentations and transformations on the other (in other words, between what the system means and what it does), we may contribute to finding a balance in the “rationalistic/design” opposition identified by Winograd (2006).

Finally, the role of the Web in the design of modern intelligent systems cannot be neglected. The Web is not a separate application domain; it is the backdrop to all users’ interactions with any system or artefact, even non-connected ones, even non-digital ones: people will search forums for help with an application, they will google an unknown word from a book, they will not be surprised to see a URL printed on the side of a bus, even if this “link” is not directly actionable. It can be argued that, nowadays, there is no such thing as a non-connected object.

The value of the Web lies not only in the amount of information available to us, but also (and mostly) in the numerous links that interconnect this information. Every link puts a piece of information in the context of many others, allowing as many interpretations and re-interpretations, provided that we have the right tools to take advantage of this ambivalence. This is the kind of tool that Bush (1945) had in mind when he imagined the Memex. Instead, as Pariser (2011) warns us, many current tools collapse this multiplicity, providing each of us with a comfortable, personalized and unique perspective of our own. We must therefore make sure that the Web’s ambivalence is not reduced to a juxtaposition of idiosyncrasies, but remains an empowering network of intertwined trails.

Chapter bibliography

Bush, Vannevar. 1945. “As We May Think.” The Atlantic Monthly, July. http://www.liacs.nl/~fverbeek/courses/hci/memex-vbush.pdf.

Carroll, Lewis. 1871. Through the Looking-Glass. http://www.gutenberg.org/ebooks/12.

Cazenave-Lévêque, Raphaël. 2016. “Interoperability among Trace Formalisms.” Master’s Thesis, Université Lyon 1.

Hall, Wendy, Jim Hendler, and Steffen Staab. 2016. “Web Science Manifesto.” The Web Science Trust. November 22, 2016. http://www.webscience.org/manifesto/.

Johnson, George. 2016. “To Beat Go Champion, Google’s Program Needed a Human Army.” The New York Times, April 4, 2016. http://www.nytimes.com/2016/04/05/science/google-alphago-artificial-intelligence.html.

Lanthaler, Markus, and Christian Gütl. 2013. “Model Your Application Domain, Not Your JSON Structures.” In Proceedings of the 22Nd International Conference on World Wide Web, 1415–1420. WWW ’13 Companion. New York, NY, USA: ACM. https://doi.org/10.1145/2487788.2488184.

Lebo, Timothy, Satya Sahoo, and Deborah McGuinness. 2013. “PROV-O: The PROV Ontology.” W3C Recommendation. W3C. https://www.w3.org/TR/prov-o/.

Mansour, Essam, Andrei Vlad Sambra, Sandro Hawke, Maged Zereba, Sarven Capadisli, Abdurrahman Ghanem, Ashraf Aboulnaga, and Tim Berners-Lee. 2016. “A Demonstration of the Solid Platform for Social Web Applications.” In Proceedings of the 25th International Conference Companion on World Wide Web, 223–226. International World Wide Web Conferences Steering Committee. http://dl.acm.org/citation.cfm?id=2890529.

Pariser, Eli. 2011. Beware Online “Filter Bubbles.” TED Talk. https://www.ted.com/talks/eli_pariser_beware_online_filter_bubbles.

Sadallah, Madjid, Olivier Aubert, and Yannick Prié. 2014. “CHM: An Annotation- and Component-Based Hypervideo Model for the Web.” Multimedia Tools and Applications 70 (2): 869–903. https://doi.org/10.1007/s11042-012-1177-y.

Sanderson, Robert, Paolo Ciccarese, and Benjamin Young. 2016. “Web Annotation Data Model.” W3C Candidate Recommendation. W3C. https://www.w3.org/TR/annotation-model/.

Snell, James, and Evan Prodromou. 2016. “Activity Streams 2.0.” W3C Candidate Recommendation. W3C. http://www.w3.org/TR/activitystreams-core/.

Steiner, Thomas, Rémi Ronfard, Pierre-Antoine Champin, Benoît Encelle, and Yannick Prié. 2015. “Curtains Up! Lights, Camera, Action! Documenting the Creation of Theater and Opera Productions with Linked Data and Web Technologies.” Rotterdam, NL. https://hal.inria.fr/hal-01159826/document.

Troncy, Raphaël, Erik Mannens, Silvia Pfeiffer, and Davy Van Deursen. 2012. “Media Fragments URI 1.0 (Basic).” W3C Recommendation. W3C. http://www.w3.org/TR/media-frags/.

Williams, Hannah. 2016. “Google’s DeepMind AI Masters Lip-Reading.” Computer Business Review (blog). November 25, 2016. http://www.cbronline.com/news/internet-of-things/googles-deepmind-ai-masters-lip-reading/.

Winograd, Terry. 2006. “Shifting Viewpoints: Artificial Intelligence and Human–Computer Interaction.” Artificial Intelligence, Special Review Issue, 170 (18): 1256–58. https://doi.org/10.1016/j.artint.2006.10.011.

Wu, Garrett. 2013. “Why More Data and Simple Algorithms Beat Complex Analytics Models.” Data Informed (blog). August 7, 2013. http://data-informed.com/why-more-data-and-simple-algorithms-beat-complex-analytics-models/.