A General Introduction to Web 2.0 Technologies and Applications
Presented by: Prof Mark Baker
ACET, University of Reading
Tel: +44 118 378 8615
E-mail: [email protected]
Web: http://acet.rdg.ac.uk/~mab
March 23, 09 [email protected]

A General Thanks
• Firstly, I would like to say a big THANK YOU to all the speakers that I have harassed over the last couple of months to participate in this workshop.
• The event has talks on:
  – Web 2.0 technologies,
  – Clouds,
  – User/Usability,
  – Application,
  – Tutorials,
  which should help start people in some of these technology areas.

Outline
• General Introduction,
• What is Web 2.0?
• Gartner Hype Curve…
• Web 2.0 Technologies:
  – Wikis, Blogs, RSS, Tagging,
  – Social networking,
  – Flickr, Slideshare, YouTube, Twitter,
  – REST, AJAX,
  – iGoogle, Google Gadgets,
  – Web Semantics, Twine,
  – Security concerns,
• Summary/Conclusions.

General Introduction
• Various technologies seem to appear in waves; some are taken up and are successful, and others die out quickly.
• I have been working in the parallel, distributed computing and HPC arena for 20+ years.
  – Seen lots of interesting technologies come and go!
    • CORBA, Jini… etc…
  – Spent a lot of time working on grid technologies and e-Science.
• However, the Web 2.0 area seems to have been one of those domains of interest that has taken off like a rocket!
• Hence the keen interest in this workshop, and the Edinburgh eSI theme that is exploring
  – “The Influence and Impact of Web 2.0 on e-Research Infrastructure, Applications and Users”…

General Introduction
• It would be easy to ask questions as to why we want to explore this area… but these are some reasons!

What is Web 2.0?
• Tim O'Reilly popularised the term back in 2004.
  – The term became more significant after the O'Reilly Media Web 2.0 conference in 2004.
• Tim O'Reilly said that “Web 2.0 is the business revolution in the computer industry caused by the move to the Internet as a platform, and an attempt to understand the rules for success on that new platform”…
• Many of us back in those days really wondered exactly what Web 2.0 was…!?
  – At that stage we thought the Web 2.0 stack was fairly empty… but since those days the extent to which people collaborate and communicate, and the range of tools and technologies, have rapidly changed.

What is Web 2.0?
• Another, more compact, description from Tim O'Reilly…
  – Web 2.0 is the network as platform, spanning all connected devices;
  – Web 2.0 applications are those that make the most of the intrinsic advantages of that platform:
    • Delivering software as a continually-updated service that gets better the more people use it,
    • Consuming and remixing data from multiple sources, including individual users, while providing their own data and services in a form that allows remixing by others,
    • Creating network effects through an "architecture of participation," and going beyond the page metaphor of Web 1.0 to deliver rich user experiences.

Web 2.0
• Web 2.0 has many aspects:
  – Business Models that survived and have promise for the future.
  – Approaches such as services instead of products, the Web as a platform, ...
  – Concepts such as folksonomies, syndication, participation, reputation, ....
  – Technologies such as AJAX, REST, Tags, Microformats, ...
  – And many others ...

What is Web 2.0?
• A concept not a product.
• A way of thinking.
• A way of working – collaborative and social.
• About:
  – Sharing information with others,
  – Information coming to you,
  – Deciding how you receive and view the information.
• All sorts of technologies but….
• Examples:
  – Blogs, RSS, Wikis, social bookmarking (e.g.
Furl, Del.icio.us, Connotea), Flickr, Facebook, MySpace, web based forums, email discussion lists, YouTube, Second Life……

Gartner Hype Curve

Gartner's 10 strategic technologies for 2009
• The "potential for significant impact on the enterprise in the next three years":
  1. Virtualization,
  2. Cloud computing,
  3. Servers (beyond blades),
  4. Web oriented architectures,
  5. Enterprise mashups,
  6. Specialised systems,
  7. Social software / networking,
  8. Unified communications,
  9. Business intelligence,
  10. Green IT.

Web 2.0
Web 1.0 --> Web 2.0
DoubleClick --> Google AdSense
Ofoto --> Flickr
Akamai --> BitTorrent
mp3.com --> Napster
Britannica Online --> Wikipedia
personal web sites --> Blogging
Evite --> Upcoming.org and Events and Venues Database
Domain name speculation --> Search engine optimisation
Page views --> Cost per click
Screen scraping --> Web Services
Publishing --> Participation
Content management systems --> Wikis
Directories (taxonomy) --> Tagging ("folksonomy")
Stickiness --> Syndication
From Tim O’Reilly’s “What is Web 2.0” on O’ReillyNet, 9/30/2005; http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html?page=1

Wikis
• wiki-wiki – Hawaiian meaning quick.
• First wiki was the WikiWikiWeb, Ward Cunningham 1995.
• A collaborative web application that allows users to easily add and edit content.
• Can be used for:
  – Developing documentation,
  – Project management:
    • History keeps a record of the changes and different versions of the documents.
  – Developing a conference programme.
• Encourages collaboration.
• Many have blog like discussion areas and RSS feeds.

Wikis
• Relatively standardised format and layout: “Makes our contributors concentrate on content rather than wasting time on pretty layouts”.
• By default, most wikis let anyone create and edit a page:
  – Need to protect Admin functions and limit creation, edit and access rights,
  – Can “lock” individual pages or sections,
  – Can require registration to set up new pages or edit existing ones.

Wikipedia
[Screenshot: option to edit the page]

Wikipedia (2)
[Screenshot: no edit option]

Wikipedia - history
[Screenshot: date of edits; author/editor]

What are wikis used for in real life?
• Wikis for training materials and conference organising:
  – NeSC/eSI do this.
• Wikis for compiling subject guides.
  – We create manuals/user-guides in our private Wiki, then have some PHP that lets us expose the content to the public.
• Using a Wiki on an Intranet for internal purposes.

Blogs
• What is a Blog?
  – Short for web log,
  – Content management system that publishes information chronologically,
  – Content can range from self-indulgent drivel to extreme depth,
  – Easy to use and publish from anywhere, therefore there is a high proportion of utter rubbish in the “blogosphere”,
  – Blogs automatically generate RSS feeds.

Anatomy of a Blog (2)
[Screenshot: tags; archives; list of recent posts; blogroll of related blogs]

Applications of Blogs
• Instead of, or in addition to, a printed, emailed or static web-based newsletter:
  – Current awareness for staff, users, researchers and clients: “What’s new”,
  – Publicising new services/products, encouraging feedback via comments.
• Marketing tool inside and outside of the organisation.
• Recording professional development and reflective practice plus project development and discussions.
• Comments or “suggestions” box.
• Monitor blogs for information and competitor intelligence.
• Alternative publishing medium.

Blogs as sources of information
• Blogs by industry gurus and experts are a good way of keeping up to date with what is happening in a particular sector.
• Look for the Blogroll or List of Links on a relevant blog.
• Google Blogsearch http://www.google.com/blogsearch
  – Use advanced search to search within an individual blog.
• Ask http://www.ask.com/
  – Blogs and feeds.
• Live Feeds search - http://search.live.com/feeds.
• Blog search engines and directories:
  – http://www.technorati.com/
  – http://www.blogpulse.com/
  – http://www.quacktrack.com/

What is RSS?
• Stands for Really Simple Syndication, or Rich Site Summary, or RDF Site Summary.
  – Depends on version:
    • Rich Site Summary (RSS 0.9x),
    • RDF Site Summary (RSS 0.9 and 1.0),
    • Really Simple Syndication (RSS 2.x).
  – Also Atom (used by Google).
  – Written in XML.
  – Look for the orange logos.
• A means of delivering headlines, alerts, tables of contents.
• Regarded as the de facto standard.

Why is RSS not that popular?
You need a feed “reader”…

http://www.google.com/reader
….like Google Reader

RSS instead of email
• Reduces the overload in your email inbox.
• By-passes spam filters.
• Quicker and easier to scan and spot individual headlines within an alert or newsletter and decide what is relevant.
• Can set up filters to pick up stories that mention specific products, companies...
• You control when you receive and read the feeds.
• Easier to “unsubscribe”.

Tagging on Del.icio.us

Some Common Uses for Del.icio.us
• Storing bookmarks online so they can be accessed from the Internet.
• Consolidating bookmark collections to eliminate the confusion of attempting to locate bookmarks stored on multiple computers.
• Personal interests – shopping, vacations, hobbies, and so on.
• Academic pursuits – keeping track of online source materials in one location.
• Sharing – making bookmarks available to the public.
• Expertise mining – all bookmarks on del.icio.us have been chosen by a human being.
  – Exploring the results of their previous searches is a great labour saver.

Facebook Facts
• Not just for college students anymore.
• Anyone with a valid e-mail address can join…
• Over 175 million active users (users who have returned to the site in the last 30 days).
• Company has 700+ employees.
• More than half of Facebook users are outside of college, with the fastest growing demographic being those 30 years old and older.
• Average user has 120 friends on the site.
• More than 3 billion minutes are spent on Facebook each day (worldwide).
http://www.facebook.com/press/info.php?factsheet (Feb/09)

Facebook

Flickr
• http://www.flickr.com/
• Owned by Yahoo!
• Share photos with selected individuals or make public.
• Put photos of your library’s or organisation’s events on Flickr:
  – Promote your department, information centre, organisation,
  – Direct journalists to your “album” when they ask for photos to accompany articles about you,
  – Make sure you tag and describe them,
  – Organise into sets,
  – Decide on copyright and Creative Commons licenses.

Flickr

Slideshare
• Share presentations.
• Include an accompanying commentary.
• Keep private, share with selected people, or make public.
• Slideshare does not keep animations and embedded links.
• Slideshare - http://www.slideshare.net/
• Embed Slideshare in your blog, web site, Facebook profile, start page ……..

Slideshare

YouTube
• http://www.youtube.com/
• Owned by Google.
• Videos of varying content and quality:
  – News broadcasts,
  – Various videos and corporate broadcasts,
  – PR, advertising campaigns,
  – Videos of events, new service launches, anything,
  – The Queen has a YouTube channel!
    • http://www.youtube.com/user/TheRoyalChannel
• Embed YouTube videos in your Blog, Facebook page, start page, web site etc.

Twitter
• http://www.twitter.com/
• Microblogging:
  – “tweets” are up to 140 characters,
  – What are you doing?
  – “follow” friends,
  – Lots of plugins for your browser and desktop e.g. TwitKit,
  – Send first 140 characters of your blog postings to Twitter using http://twitterfeed.com,
  – Add Twitter to your Facebook profile.
• Search for friends and colleagues, and topics:
  – Twitterment, Tweet Scan etc.
• Analyse a person’s tweets with Tweet Clouds:
  – http://www.tweetclouds.com/

Twitter

Who is on Twitter?
• The BBC
• The Times
• 10 Downing Street

Conference Twitter Streams
• “Blogging conferences is so 20th century!”
  – Twitterers/tweeters abound at conferences,
  – The INSOURCE Conference Twitter Experiment http://www.rba.co.uk/wordpress/2008/02/11/theinsource-conference-twitter-experiment/ ,
  – Can set up a Twitter event stream,
  – Delegates, conference chairs, moderators can all comment on and monitor the proceedings,
  – Send tweets to your blog using LoudTwitter:
    • Generates a chronological list of your tweets by day and with the oldest listed first,
    • Easier to read as a record of the event.

Second Life

What next?
• Play and experiment.
• You do not have to try everything.
• Focus on what you think will make your work easier, more productive, more effective.
• If it does not work or it takes longer to carry out a task without significant benefits, ditch it!
• There is no law that says you have to use something just because it has a Web 2.0 tag.

What is AJAX?
• AJAX is the acronym for Asynchronous JavaScript and XML.
• The purpose is to create more dynamic and responsive web pages.
• It is also about building web clients in a Service Oriented Architecture that can connect to any kind of server: J2EE, PHP, ASP.Net, Ruby on Rails, etc.
• AJAX involves existing technology and standards:
  – JavaScript and XML.
• Pattern: a page view displayed in a web browser retrieves data or mark-up fragments from a service and refreshes just a part of the page.

What is AJAX?
• AJAX is non-trivial: it requires deep and broad skills in web development .... but the benefits to be gained can be huge compared to classic web applications.
• AJAX enables major improvements in responsiveness and performance of web applications, e.g. used at Yahoo! Mail, Google Maps, live.com, and others.
• AJAX is NOT hype – it is very real and very useful for highly interactive applications.

AJAX compared to classic Web UIs
• Classic (Browser, Server): in the typical web application, each request causes a complete refresh of the browser page.
• AJAX (Browser, service, Server): an Ajax application begins the same way. After the initial page loads, JavaScript code retrieves additional data in the background and updates only specific sections of the page.

What is REST?
• REST is the acronym for “Representational State Transfer” – an architectural model for the Web!
• Principles of REST:
  – Resource centric approach,
  – All relevant resources are addressable via URIs,
  – Uniform access via HTTP – GET, POST, PUT, DELETE,
  – Content type negotiation allows retrieving alternative representations from same URI.
• REST style services:
  – Easy to access from code running in web browsers, any other client or servers - popular in the context of AJAX,
  – Take advantage of the Web caching infrastructure,
  – Can serve multiple representations of the same resource.
• See http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

Tycho
A Resource Discovery Framework and Messaging System for Distributed Applications
http://acet.rdg.ac.uk/projects/tycho/

Tycho Architecture
• Tycho consists of the following components:
  – Mediators that allow producers and consumers to discover each other and establish remote communications,
  – Consumers that typically subscribe to receive information or events from producers,
  – Producers that gather and publish information for consumers.
• There is an asynchronous messaging API.
• In Tycho, producers and/or consumers (clients) can publish their existence in a Virtual Registry (VR).
[Figure: Tycho architecture, showing mediators (registry and core) connecting producers and consumers over WAN (HTTP), WAN (P2P) and LAN (socket) links]

Tycho Design
• Tycho is based on a publish, subscribe and bind paradigm.
• Design Philosophy:
  – We believed that the system should have an architecture similar to the Internet, where every node provides reliable core services, and the complexity is kept, as far as possible, to the edges:
    • The core services can be kept to the minimum, and endpoints can provide higher-level and more sophisticated services, that may fail, but will not cause the overall system to crash.
  – We have kept Tycho’s core small, simple and efficient, so that it has a minimal memory foot-print, is easy to install, and is capable of providing robust and reliable services.
  – More sophisticated services can then be built on this core and are provided via libraries and tools to applications.
• Allows Tycho to be flexible and extensible so that it will be possible to incorporate additional features and functionality.

iGoogle

iGoogle
• iGoogle portal is a free Google service,
• Is a customisable web portal,
• Users can add “Gadgets” to the page,
• Customisations are saved to the user’s account and retrieved when logging in again.

Google Gadgets

Google Gadgets
• Gadgets are small user interface components:
  – Could also be called portlets or widgets.
• Example: eBay Search Plus Gadget.

Gadgets are Dynamic Web Applications
• Gadgets can be static, but then are of limited use.
• Dynamic Gadgets are more common.
• Three general approaches when making a dynamic gadget:
  – Time dynamic – the content changes over time, e.g. a news gadget,
  – User input dynamic – the content changes via a user interacting with the gadget (forms and links),
  – User preference dynamic – the user sets preferences that persist across user sessions (e.g. eBay).
• Gadgets need not include a page header/footer, they focus on the specific application they surface.

Gadgets are NOT hosted by Google
• Google Gadgets can be created by anyone.
• Gadget must be deployed on a public web server.
• Once deployed, anyone can use the Gadget.
• iGoogle supports a Gadget library to help users find Gadgets they may want to use.

Google Gadgets are Web Pages
• Google Gadgets are implemented behind public URLs.
• Any public server that speaks HTTP and returns HTML can be a Gadget host:
  – Apache web server,
  – PHP,
  – Ruby on Rails,
  – ASP .NET,
  – Java Application Servers (Servlet Containers).
• Important: Your web server must be exposed to the Internet!
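
The idea that a gadget host is simply "a server that speaks HTTP and returns HTML" can be sketched with Python's standard library. This is an illustrative sketch only and does not use any Google-specific API: the names `render_gadget`, `GadgetHandler` and `serve`, the `name` query parameter and port 8080 are all assumptions made for the example. It returns a bare HTML fragment (no page header/footer) whose content is both time dynamic and user input dynamic.

```python
import html
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def render_gadget(path: str) -> str:
    """Build the HTML fragment for one request (hypothetical gadget)."""
    query = parse_qs(urlparse(path).query)
    # "User input dynamic": the content depends on a query parameter.
    name = query.get("name", ["world"])[0]
    # "Time dynamic": the content changes from one request to the next.
    now = datetime.now(timezone.utc).strftime("%H:%M:%S UTC")
    # Escape user input before echoing it back into the HTML.
    return f'<div class="gadget">Hello, {html.escape(name)}! It is {now}.</div>'


class GadgetHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the gadget fragment."""

    def do_GET(self) -> None:
        body = render_gadget(self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the gadget host until interrupted (port is an arbitrary choice)."""
    HTTPServer(("", port), GadgetHandler).serve_forever()
```

Calling `serve()` and opening `http://localhost:8080/?name=Ada` in a browser would show the fragment; to act as a real gadget host the server would still need to be reachable from the Internet, as the slide stresses.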

Approaches to Web Semantics
• Tagging,
• Statistics,
• Linguistics,
• Semantic Web:
  – RDF – Store data as “triples”,
  – OWL – Define systems of concepts called “ontologies”,
  – SPARQL – Query data in RDF,
  – SWRL – Define rules,
  – GRDDL – Transform data to RDF.
• Artificial Intelligence.

A Mainstream Application of the Semantic Web…

What is Twine?
• Twine is a new service for managing and sharing information on the Web.
• Works for content, knowledge, data, or any other kinds of information.
• Designed for individuals and groups that need a better way to organise, search, share and keep track of their information.

How Twine Works
1. Collect or author structured or unstructured information into Twine via email, the Web or the desktop.
2. Twine creates a knowledge web automatically:
  – Understands, tags and links information automatically,
  – Automatically does further research for you on the Web,
  – Organises information automatically.
3. Provides semantic search, discovery and interest tracking.
4. Helps you connect with other people and groups to grow and share knowledge webs around common interests.

Security Issues
• The Web Browser is now the Web 2.0 platform.
• It needs cross application features with a solid security model.
  – For example, one area that has not currently been solved in Web 2.0 is that the browser does not sandbox the various Web 2.0 components that you may need to use.
  – So if you are using mashups from various sources (Google/Amazon/Yahoo) within the browser, the JS from one component can interact with the JS of another component, play with your cookies and probably screw up other browser-hosted components!
• Google search - Web 2.0 security issues - gave MANY hits!
• Many areas of Web 2.0 have open security issues; AJAX is probably one of the biggest!

Security
• AJAX is a hacker's dream come true.
• It offers an increased attack surface:
  – Direct API access,
  – Vulnerability to reverse engineering,
  – Susceptibility to amplifying Web attacks,
  – Vulnerability to offline attacks.
• In general, if you want to secure AJAX applications you must do six things:
  1. Perform authentication/authorisation checks on both Web pages and Web services,
  2. Group code libraries by function,
  3. Validate all input for your application, including HTTP headers, cookies, query string and POST data,
  4. Verify data type, length and format,
  5. Always use parameterised queries,
  6. Always encode output appropriately.
Source: Billy Hoffman, who runs HP Security Labs, author of Ajax Security (Addison-Wesley)

Firebug - Great Debugging/Hacking Tool

SQL Injection
• SQL injection plays on a simple problem:
  – A Web page's input fields often fail to distinguish between innocent user data - information like names or dates - and malicious commands,
  – When a hacker's hidden instructions are entered into a Web site's input forms, the site may confuse them with user data and pass the commands to its SQL database, where they are executed as part of the database query,
  – That lets the hacker access the site's data or add commands to the page so as to infect a visitor with malicious software,
  – A survey of major Web sites by the Web security firm White Hat Security found that 16% of sites were vulnerable to this tactic.

Cross-Site Scripting
• About 65% of the major sites surveyed by security analysts White Hat Security are vulnerable to an attack called cross-site scripting, which allows a disturbing upgrade to phishing attacks.
• The typical phisher e-mails users a link that brings them to a fraudulent site, conning them into sharing credit card information or other sensitive data.
• In a cross-site scripting attack, the link instead folds a hidden command into a destination site's code.
• That means even a legitimate page can be secretly tweaked so that when a user enters bank codes or other sensitive information, the data ends up in the hands of the phisher.
• The threat of cross-site scripting is yet another reason to watch out for links in unfamiliar e-mails.

Cross-Site Request Forgery
• Cross-site request forgery, sometimes loosely called "sidejacking" (a term more strictly used for session hijacking), takes advantage of a vulnerability that is common to password-protected Web pages.
• When a user logs in to a private site their identity is marked with a "cookie" - a small piece of data stored in the user's browser.
• But if that user can be tricked into visiting a malicious site while still logged in to that password-protected page, the second site can make the user's browser send requests to the first site. The browser attaches the user's cookie automatically, so those requests are carried out with the user's access to the first site's private information.

Google Hacking
• About two out of every three Web searches start at Google.
• So, it seems, do many attacks on Web sites.
• "Google hacking" uses the search engine to probe the entire Web for sensitive information or hackable vulnerabilities in code.
• Just by entering the right search string, for instance, hackers are sometimes able to find repositories of credit card information or social security numbers stored on the Web.
• Recently, an attack seeming to originate in China used Google to probe the Web for sites vulnerable to a certain strain of SQL injection, targeting more than half a million pages and infecting them with malicious software.

Forced Browsing
• In some cases, "hacking" a Web site is as simple as changing a single digit in a Web address.
• By changing the characters in a page's address that refer to a name or date, a malicious user can sometimes gain access to pages they are not intended to see, a process security professionals call "forced browsing”.
• In 2006, Phil Angelides, a Democratic contender in the California gubernatorial campaign, was accused of hacking rival Arnold Schwarzenegger's Web site and obtaining a confidential audio file.
• But a source close to the Democratic campaign told News.com that Angelides' aides had merely tampered with a URL to find the file.

Timing Attacks
• As much as Web sites try to hide their inner workings from hackers, some pages reveal information in signs as subtle as how quickly they load.
• Security researchers have shown that software that guesses random usernames on a Web application's login page can sometimes reveal which usernames are valid even without a password - that is because a valid username causes the site to pause for a slightly different length of time than an incorrect username would.
• In some cases, spammers can use that simple trick to collect thousands of valid e-mail addresses, which they then target!
• In a 2005 issue of the hacker magazine 2600, another researcher revealed how to use timing analysis to determine the dealer's hand in an online blackjack gambling site.

Captcha Breaking
• One major challenge for security professionals is distinguishing humans from software "bots" on the Web.
• In a webmail service, for instance, users are shown a "captcha," a distorted word or image, and asked to identify the text or picture.
• The goal is to foil software designed to sign up for accounts for the purpose of churning out spam.
• But in some cases, spammers have beaten the countermeasure by creating sites that enlist users to solve captchas by the hundreds in exchange for pornographic images.
• Google's Gmail captcha was the latest victim of cybercriminals.
• Because the site offers an audio function that reads captchas aloud for blind users, hackers were able to use speech-to-text software to defeat the test automatically.

Distributed Denial Of Service
• Sometimes a hacker's goal is not to steal information or infect users with malicious software but rather to shut down a site altogether.
• In those cases, cyber-criminals often employ distributed denial of service (DDoS) attacks, a technique that floods a Web server with requests for information and overwhelms it.
• Using botnets, armies of unsuspecting computers hijacked with invisible software, cyber-criminals can vastly multiply the size of their attacks and also mask their origins.

Conclusions
• More and more people are using Web 2.0 technologies – the other speakers within this workshop will present and show how these technologies and ideas are helping their research.
• Some people like the ideas related to Web 2.0, others feel they are not good!
• There has been a lot of discussion on the Internet about Web 3.0!
• Jim Hendler sees Web 3.0 as the “Semantic Web technologies integrated into, or powering, large-scale Web applications”.
• From my own viewpoint, Web 3.0 will probably be the integration of Web 2.0 and the Semantic Web.

Web X Roadmap
[Figure: Web roadmap by Nova Spivack, CEO & Founder, Radar Networks. Axes: connections between information against connections between people. Eras: PC Era (1980 - 1990), Web 1.0 (1990 - 2000), Web 2.0 (2000 - 2010), Web 3.0 / Semantic Web (2010 - 2020), Web 4.0 / Intelligent Web (2020 - 2030).]

Research3.Org

Forthcoming Events

Meteorological Event at Reading