The Architecture of the World Wide Web
Min Song
IS NJIT

Internet Architecture
- Today’s Internet
  - Thousands of networks
  - Connected by legal agreements and commercial contracts
  - Uses the TCP/IP protocol suite
- Internet service providers (ISPs)
  - Provide most individual users with access to the Internet
  - Dialup connections: modems and conventional phone lines
  - xDSL and cable modems provide broadband access

Packet Switching
- Most modern Wide Area Network (WAN) protocols, including TCP/IP, X.25, and Frame Relay, are based on packet switching.
- Packet switching is more efficient and robust for data that can withstand some delays in transmission, such as e-mail messages and Web pages.
- Circuit switching
  - Normal telephone service is based on circuit-switching technology: a dedicated line is allocated for transmission between two parties.
  - Suited to data that must be transmitted quickly and must arrive in the same order in which it is sent.
  - Typical of real-time data, such as live audio and video.

Use of Packets

Internet Protocols: TCP/IP
- Communications protocol suite
- Packet-switched protocol
  - No dedicated end-to-end connection is required
  - Each message is broken down into small pieces called packets
  - Packets are possibly routed to the destination over different paths
- Transmission Control Protocol (TCP)
  - Breaks messages into packets
  - Numbers packets in order
  - Reorders packets at the destination
- Internet Protocol (IP)
  - Routes packets to the proper destination

Domain Names
- Every computer connected to the Internet must have a unique IP address
- IP address format is xxx.xxx.xxx.xxx, where xxx is a number between 0 and 255
- How do we know that 207.46.245.222 is Microsoft?
- Domain Name System (DNS)
  - A database of Internet names
  - DNS servers convert Internet names to IP addresses
  - Top-level domains

Ping: tests whether a particular host is reachable across an IP network.
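The packet handling attributed to TCP earlier (break a message into packets, number them, reorder them at the destination) can be illustrated with a toy sketch. This is not real TCP: the function names `packetize` and `reassemble` and the (sequence number, payload) tuple layout are assumptions made only for this example.

```python
import random
from typing import List, Tuple

# Hypothetical packet layout for illustration: (sequence number, payload).
Packet = Tuple[int, bytes]


def packetize(message: bytes, size: int) -> List[Packet]:
    """Break a message into numbered packets of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Reorder packets by sequence number and join their payloads."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


if __name__ == "__main__":
    message = b"Packets may take different paths to the destination."
    packets = packetize(message, 8)
    random.shuffle(packets)  # simulate out-of-order arrival
    assert reassemble(packets) == message
```

Because every packet carries its own sequence number, the receiver does not depend on arrival order, which is what lets packets travel over different paths.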
Tcpdump: sniffs network packets; the resulting dumps can be used for statistical analysis.

The World Wide Web
- Collection of hyperlinked computer files on the Internet
- Client-server application
  - Web servers
  - Web browsers as clients

WWW standards
- Hypertext markup language (HTML)
  - Current standard for writing Web pages
  - Application of SGML specifically for Web pages
  - Tags in HTML instruct the client browser how to format and display the Web page content
- Hypertext transfer protocol (HTTP)
  - Protocol that establishes a connection between Web server and client
- Extensible markup language (XML)
  - A meta-markup language
  - Gives meaning to the data enclosed within XML tags

Static versus Dynamic Web Pages
- HTML and XML only display and exchange data
  - No interactivity; no processing of data
- Scripting languages (client-side, or browser-side, scripting)
  - Provide basic interactivity
    - Rollovers
    - Crawling text
  - JavaScript
  - VBScript
- Full-featured Web programming
  - Java
    - Applets
    - J2EE
  - Common Gateway Interface (CGI)
    - Allows passing of data between a static HTML page and a computer program

Searching the WWW
- Most data on the Internet is part of the WWW
- Search engines: large databases that index WWW content
- Building the search engine database
  - Submit a site to the search engine administrator for listing
  - Spiders
    - Google
    - Yahoo
  - Metatags

Hypertext Transfer Protocol
- A protocol (syntax and semantics) for transferring representations of resources
  - usually across the Internet using TCP
- Design goals
  - speed (stateless, cacheable, few round trips)
  - simplicity
  - extensibility
  - data (payload) independence
- A true network-based API

HTTP/0.9 (pre-1993)
- Absolute Simplicity

    GET /url-path

    <TITLE>Hello World</TITLE>
    Hello World

- No Extensibility
  - only one method (GET)
  - no request modifiers
  - no response metadata

HTTP/1.0 (1993-present)
- Simple and (mostly) Extensible

    GET /Test/hello.html HTTP/1.0
    Accept: text/html
    User-Agent: GET/5 libwww-perl/0.40

    HTTP/1.0 200 OK
    Date: Fri, 12 Jan 1996 01:02:49 GMT
    Server: Apache/1.0.5
    Content-type: text/html
    Content-length: 38
    Last-modified: Wed, 10 Jan 1996 01:

    <TITLE>Hello</TITLE>
    Hello out there!

HTTP/1.0 Deficiencies
- No complete specification until end of '94
- No minimum standard for compliance
- Poor network behavior
  - one request per connection
  - no reliable transfer of dynamic content
  - no control over response caching
- Failed to anticipate proxies and gateways
- Created huge demand for vanity addresses
- Misuse/misunderstanding of MIME

HTTP/1.1
- Culmination of two years' work, RFC 2068
  - with Henrik Frystyk, Jim Gettys, Jeff Mogul
  - designed at UCI and W3C; expanded in IETF
- Improved Reliability
  - chunked transfer of dynamic content
  - recognition of proxy and gateway requirements
  - explicit cacheability of responses
- Improved Network Behavior
  - persistent connections
  - virtual hosts (many names, one address)

HTTP/1.1 (1997-????)
- Less Simple, More Extensible, but Compatible

    GET /Test/hello.html HTTP/1.1
    Host: kiwi.ics.uci.edu:8080
    User-Agent: GET/7 libwww-perl/5.40

    HTTP/1.1 200 OK
    Date: Fri, 07 Jan 1997 15:40:09 GMT
    Server: Apache/1.2b6
    Content-type: text/html
    Transfer-Encoding: chunked
    Etag: "a797cd-465af"
    Cache-control: max-age=3600
    Vary: Accept-Language
    …

HTTP/1.x Deficiencies
- MIME is too verbose (overhead per message)
- Control mixed with metadata
- Metadata restricted to header or trailer
- Fixed request/response ordering can block progress
- Incurs frequent round-trip delays due to connection establishment

HTTP/2.x
- Tokenized transfer of common fields
  - reducing bandwidth usage, latency
  - removal of MIME syntax limitations
  - self-descriptive for extensions
- Multiplexing control, data, metadata streams
  - reducing desire for multiple connections
  - enabling multi-protocol connections
  - per-stream priority or credit mechanism
- Layered streams for meta-metadata, encryption...

XML to the rescue?
- “X” for extensible: self-descriptive syntax
  - semantics by reference (doctype, namespaces)
  - rendering by reference (style sheets)
- An XML representation is an object turned inside-out, with behavior-by-reference
- However, network application performance will demand standards for domain-specific doctypes and style sheets

Future Work
- Dynamic application architectures
- Architectural analysis and performance bounds
- Impact of future network architectures (ATM)
- Balancing secure transfer with firewall visibility
- Protocol for manipulating resource mappings
- HTTP-NG (W3C/Xerox PARC)
- rHTTP (UCI)
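
The "chunked transfer of dynamic content" listed among the HTTP/1.1 improvements can be illustrated with a minimal sketch. It assumes the standard chunked framing (a hexadecimal size line, the chunk data, a CRLF, and a final zero-size chunk); the helper names `encode_chunked` and `decode_chunked` are hypothetical, and trailers and chunk extensions are ignored rather than handled.

```python
from typing import Iterable


def encode_chunked(parts: Iterable[bytes]) -> bytes:
    """Frame each non-empty part as one chunk, then add the terminating chunk."""
    out = bytearray()
    for part in parts:
        if not part:
            continue  # a zero-length chunk would signal end of body
        out += format(len(part), "x").encode("ascii") + b"\r\n"
        out += part + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, no trailers
    return bytes(out)


def decode_chunked(body: bytes) -> bytes:
    """Recover the payload from a chunked body (trailers are ignored)."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size is hexadecimal; drop any ";extension" that follows it.
        size = int(body[pos:line_end].split(b";")[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break
        out += body[pos:pos + size]
        if body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk not terminated by CRLF")
        pos += size + 2
    return bytes(out)


if __name__ == "__main__":
    wire = encode_chunked([b"Hello ", b"out there!"])
    print(wire)
    print(decode_chunked(wire))
```

The sender never needs to know the total length in advance, which is why this framing suits dynamically generated content over a persistent connection.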