P2P Streaming Spotify Stefano Ferretti Dept. Computer Science & Engineering University of Bologna Email: s.ferretti@unibo.it
Licence Slide credits: Stefano Ferretti Part of these slides was collected from different Web sources (the slides are widely duplicated, so it is difficult to determine who the original authors are) Copyright © 2015, Stefano Ferretti, Università di Bologna, Italy (http://www.cs.unibo.it/sferrett/) This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 License (CC-BY-SA). To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
Media Streaming ● Media content distributed over the network and played out while downloading – No need for a complete download before playback begins ● Live media streaming – Media produced on the go (e.g. live concert, sport match) – Approximate synchronicity among stream production, download and playback ● Application constraints vary, e.g., conferencing requires lower delays than Internet TV ● On demand streaming – Content stored in a server and sent upon request – Features such as pause, fast forward and fast rewind are allowed – Possible to download the stream at a rate higher than the production/playback rate
P2P Media Streaming: Issues ● Bandwidth intensive applications ● Responsiveness requirements ● Churn ● Network capabilities change ● Which overlay topology?
P2P Media Streaming: Approaches ● Tree-based overlay – Push content delivery – Single or multiple trees ● Mesh-shaped overlay – Pull content delivery (swarming) – Similar to BitTorrent, with different approaches to select chunks
Single Tree Streaming ● Root: server ● Other nodes: peers ● Push based distribution ● Every node receives the stream from the parent and sends it to its child nodes ● Advantages – Simplicity – Low delay ● Drawbacks – Low resilience to churn – Load on internal nodes
Mesh based Streaming ● Content divided into chunks ● Nodes organized in a mesh ● Every node asks for specific chunks (pull based approach) – As in BitTorrent ● Advantages – Resilient to churn – Load balancing – Simple to implement ● Drawbacks – No bound on latencies
Multiple Tree based Streaming ● Stream divided into sub-streams ● Each sub-stream is associated with a tree ● Each node participates in all trees to receive the whole content ● Advantages – Resilient to churn – Load balancing ● Drawbacks – No simple algorithms
Example: SplitStream ● Basic idea – Split the stream into K stripes (with MDC coding) – For each stripe, create a multicast tree such that the forest ● Contains interior-node-disjoint trees ● Respects nodes' individual bandwidth constraints ● Approach – Built on top of Pastry and Scribe (pub/sub)
MDC: Multiple Description Coding ● Fragments a single media stream into M substreams (M ≥ 2) ● K packets are enough for decoding (K < M) ● Less than K packets can be used to approximate the content ● Useful for multimedia (video, audio) but not for other data ● Cf. erasure coding (a Forward Error Correction technique) for large data files
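A toy sketch of the MDC idea (illustrative only, not the coding scheme actually used by SplitStream): split a sample stream into M descriptions by round-robin interleaving, so that any subset of descriptions yields an approximation of the content and all of them together reproduce it exactly.

```python
# Toy illustration of Multiple Description Coding: round-robin sample
# interleaving. Real MDC codecs are far more sophisticated; this only shows
# that any subset of descriptions still allows an approximate playback.

def encode_mdc(samples, m):
    """Split the sample stream into m descriptions."""
    return [samples[i::m] for i in range(m)]

def decode_mdc(descriptions, m, length):
    """Rebuild the stream; lost descriptions (None) are approximated by
    repeating the last known sample."""
    out, last = [], 0
    for i in range(length):
        d = descriptions[i % m]
        if d is not None and i // m < len(d):
            last = d[i // m]
        out.append(last)
    return out

samples = list(range(12))
descriptions = encode_mdc(samples, m=3)
descriptions[1] = None                            # one description lost in transit
print(decode_mdc(descriptions, m=3, length=12))   # approximate reconstruction
```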
SplitStream: Interior-node-disjoint trees ● Each node in the set of trees is an interior node in at most one tree and a leaf node in all the other trees ● Each substream is disseminated over a tree ● Each tree is a Scribe multicast tree ● (Figure: source S and three Scribe multicast trees with group IDs 0x…, 1x…, 2x…, spanning peers a–i; each peer is an interior node in only one tree)
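A simplified sketch of the interior-node-disjoint property: if the K stripe identifiers all differ in their first digit, Pastry/Scribe prefix routing makes a node an interior (forwarding) node only in the tree whose stripe ID starts with the same digit as the node's own ID. IDs and names below are illustrative.

```python
# Simplified sketch of SplitStream's interior-node-disjoint forest: a node can
# be an interior (forwarding) node only in the stripe whose ID shares its own
# first digit, and is a leaf in every other stripe. IDs are illustrative.

STRIPE_IDS = ["0x", "1x", "2x"]            # K = 3 stripes, first digits differ

def interior_stripe(node_id):
    """Stripe in which this node may be an interior node (None if no match)."""
    return next((s for s in STRIPE_IDS if s[0] == node_id[0]), None)

for node in ["0a41", "1c9f", "2b07", "17d3"]:
    print(node, "-> interior in stripe", interior_stripe(node))
# Each node is interior in at most one tree and a leaf in all the others.
```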
P2P Streaming Overlays
Some other papers ● Survey – X. Zhang, H. Hassanein, "A survey of peer-to-peer live video streaming schemes – An algorithmic perspective", Computer Networks, Volume 56, Issue 15, October 2012, Pages 3548-3579, ISSN 1389-1286, http://dx.doi.org/10.1016/j.comnet.2012.06.013 ● A recent interesting paper that uses gossip – Matos, M.; Schiavoni, V.; Riviere, E.; Felber, P.; Oliveira, R., "LAYSTREAM: Composing standard gossip protocols for live video streaming", 14th IEEE International Conference on Peer-to-Peer Computing (P2P), pp. 1-10, Sept. 2014 ● Simulation code: http://www.splay-project.org/laystream
BitTorrent Live
BitTorrent Live ● Ruckert, J.; Knierim, T.; Hausheer, D., "Clubbing with the peers: A measurement study of BitTorrent live", 14th IEEE International Conference on Peer-to-Peer Computing (P2P), pp. 1-10, Sept. 2014 ● http://live.bittorrent.com/ – The system has been down for a while, as the BTLive developers are preparing a mobile version ● "Peer-to-Peer Live Streaming," Patent 20 130 066 969, March 2013. [Online]. Available: http://www.freepatentsonline.com/y2013/0066969.html ● IEEE P2P 2012 - Bram Cohen - Keynote speech: https://www.youtube.com/watch?v=9_b6n6_yedQ
BitTorrent Live ● Hybrid streaming overlay that applies a number of different mechanisms at the same time – Use of a tracker for peer discovery (as in BitTorrent) – Use of sub-streams (as in multi-tree streaming)
BitTorrent Live ● Three stages – (1) a push injection of video blocks from the streaming source into the so-called clubs – (2) an in-club controlled flooding – (3) a push delivery to leaf peers outside the individual clubs
BitTorrent Live ● The source divides the video stream into substreams – Each substream belongs to one club ● Peers are divided into clubs – Assignment to clubs is done by the tracker when a peer joins – A peer is an active contributor in a fixed number of clubs; in all other clubs the peer is a leaf node and solely acts as a downloader
BTLive: Club ● Objective of a club is to spread video blocks of its respective substream as fast as possible to many nodes that can then help in the further distribution process
BTLive: Peer ● Download – Multiple in-club download connections with members belonging to the same club – Single out-club download connection within each of the other clubs ● Upload – Multiple upload connections to peers within the same club – Multiple upload connections to leaf nodes outside the club
BTLive: Inside the club ● Peers establish a number of upload and download connections inside their own clubs to help spread blocks of the respective substreams quickly ● Within the club a mesh topology is used – Peers receive blocks from any of the peers they have a download connection with – Push-based mechanism ● No need to ask for content ● Drawback: duplicate block transfers – Mitigated by notifications to neighbors about newly received blocks
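A minimal sketch of the in-club push with duplicate mitigation: a peer that receives a new block announces it to its in-club neighbors and pushes it only to those that have not announced it yet. Class names and message structure are illustrative, not the real BTLive protocol.

```python
# Minimal sketch of BitTorrent Live-style in-club flooding: blocks are pushed
# without being requested; "have" notifications let a peer avoid pushing a
# block to a neighbor that has already announced it. Names are illustrative.

class ClubPeer:
    def __init__(self, name):
        self.name = name
        self.blocks = set()                       # blocks held by this peer
        self.neighbor_has = {}                    # neighbor name -> announced blocks
        self.neighbors = []

    def connect(self, other):
        self.neighbors.append(other)
        self.neighbor_has[other.name] = set()

    def receive(self, block_id):
        if block_id in self.blocks:
            return                                # duplicate push: dropped
        self.blocks.add(block_id)
        for n in self.neighbors:
            n.neighbor_has[self.name].add(block_id)        # notify: "I have it"
            if block_id not in self.neighbor_has[n.name]:  # push only if still needed
                n.receive(block_id)

a, b, c = ClubPeer("a"), ClubPeer("b"), ClubPeer("c")
for x, y in [(a, b), (b, a), (b, c), (c, b), (a, c), (c, a)]:
    x.connect(y)
a.receive("block-1")                              # source injects a block into the club
print(sorted(p.name for p in (a, b, c) if "block-1" in p.blocks))   # ['a', 'b', 'c']
```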
Spotify
Spotify ● Spotify is a peer-assisted on-demand music streaming service ● Large catalog of music (over 15 million tracks) ● Available in more than 32 countries (USA, Europe, Asia) ● Very popular (over 10 million users) ● ~ 20000 tracks added every day ● ~ 7 * 10⁸ playlists
Business Idea ● Lightweight peer-assisted on-demand streaming ● Fast – Only ~ 250ms playback latency on average ● Legal ● More convenient than piracy – Free version with ads – Premium with extra features and no ads
Spotify Tech Team ● Developed in Stockholm ● Team size > 100
The importance of being fast ● How important is speed? ● Increasing latency of Google searches by 100 to 400ms decreased usage by 0.2% to 0.6% ● The decreased usage persists ● Median playback latency in Spotify is 265 ms ● In many latency sensitive applications (e.g. MOGs) the maximum allowed threshold is ~200 ms
The forbidden word
Technical Overview ● Desktop clients on Linux, OS X and Windows ● Smartphone clients on Android, iOS, Palm, Symbian, Windows Phone ● Also a Web based application ● libspotify on Linux, OS X and Windows ● Sonos, Logitech, Onkyo, and Telia hardware players ● Mostly in C++, some Objective-C++ and Java
Technical Overview ● Proprietary protocol – Designed for on-demand streaming ● Only Spotify can add tracks ● 96-320 kbps audio streams ● Streaming from – Spotify servers – Peers – Content Delivery Network (HTTP)
Everything is a link ● spotify: URI scheme ● spotify:track:6JEK0CvvjDjjMUBFoXShNZ#0:44 ● spotify:user:gkreitz:playlist:4W5L19AvhsGC3U9xm6lQ9Q ● spotify:search:never+gonna+give+you+up ● New URI schemes not universally supported ● http://open.spotify.com/track/6JEK0CvvjDjjMUBFoXShNZ#0:44
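A small sketch of how such URIs can be parsed (illustrative only, not libspotify's actual API); the optional #mm:ss fragment is a playback start offset.

```python
# Illustrative parser for spotify: URIs like the examples above.

def parse_spotify_uri(uri):
    uri, _, fragment = uri.partition("#")          # "#0:44" = start offset, if any
    parts = uri.split(":")
    if parts[0] != "spotify":
        raise ValueError("not a spotify URI")
    if parts[1] == "user" and len(parts) >= 5:     # spotify:user:<name>:playlist:<id>
        return {"type": "playlist", "user": parts[2], "id": parts[4], "offset": fragment}
    return {"type": parts[1], "id": ":".join(parts[2:]), "offset": fragment}

print(parse_spotify_uri("spotify:track:6JEK0CvvjDjjMUBFoXShNZ#0:44"))
print(parse_spotify_uri("spotify:user:gkreitz:playlist:4W5L19AvhsGC3U9xm6lQ9Q"))
print(parse_spotify_uri("spotify:search:never+gonna+give+you+up"))
```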
Spotify Protocol ● (Almost) Everything over TCP ● (Almost) Everything encrypted ● Multiplex messages over a single TCP connection ● Persistent TCP connection to server while logged in
Architecture ● Spotify uses a hybrid content distribution method, combining: – a client-server access model – a P2P network of clients ● Main advantage: only ~ 8.8% of music data comes from the Spotify servers! The rest is shared among the peers – although mobile devices do not participate in the P2P network ● Possible drawbacks: – playback latency (i.e., the time the user has to wait before the track starts playing) – complex design
Architecture
Load Balancing ● To balance the load among its servers, a peer randomly selects which server to connect to ● Each server is responsible for a separate and independent P2P network of clients – advantage: no need to manage inconsistencies between the servers’ views of the P2P network – advantage: the architecture scales up nicely (at least in principle). If more users join Spotify and the servers get clogged, just add a new server (and a new P2P network) ● To keep the discussion simple, we assume there is only one server
High-level overview ● Client connects to an Access Point (AP) at login ● AP handles authentication and encryption – Proprietary protocol ● AP demultiplexes requests, forwards to backend servers ● Gives redundancy and fault-tolerance
Storage
Storage System ● Two-tiered Storage System: – Production Storage + Master Storage
Production Storage ● Multiple data centers: each client selects its nearest – Stockholm (Sweden), London (UK), Ashburn (VA) ● A production storage is a cache with fast drives and lots of RAM ● Key-value storage with replication – Each content item is replicated on 3 servers – Use of Cassandra: an open source distributed DBMS ● Each datacenter has its own production storage – To ensure closeness to the user (latency-wise) ● Maintains the most popular tracks ● Cache miss → request to the Master Storage – Clients will experience longer latency
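A sketch of the two-tier lookup described here; class and method names are hypothetical, not Spotify's actual services. The production storage serves popular tracks from its cache and falls through to the slower master storage on a miss.

```python
# Sketch of the two-tier storage lookup: the production storage in the client's
# nearest datacenter serves popular tracks; on a cache miss the request falls
# through to the (slower) master storage and the result is cached afterwards.
# All names are illustrative.

class MasterStorage:
    def __init__(self, tracks):
        self.tracks = tracks                  # holds every track, slower access

    def get(self, track_id):
        return self.tracks[track_id]

class ProductionStorage:
    def __init__(self, master):
        self.cache = {}                       # popular tracks on fast drives / RAM
        self.master = master

    def get(self, track_id):
        if track_id in self.cache:
            return self.cache[track_id]       # fast path
        data = self.master.get(track_id)      # cache miss: longer latency for the client
        self.cache[track_id] = data
        return data

master = MasterStorage({"t1": b"...audio...", "t2": b"...audio..."})
stockholm = ProductionStorage(master)
stockholm.get("t1")                           # miss -> master storage, then cached
stockholm.get("t1")                           # hit
```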
Master Storage ● Maintains all the tracks – Slower drives and access ● Works like a DHT but with some redundancy ● Shared among all datacenters ● Tracks are kept in several formats
Communicating with backend servers ● The most common backend protocol is HTTP ● Some services need to push information, e.g. playlists
Back End Software ● Composed of many small services – Do one task and do it well – e.g., playlist system, user service, etc ● Python, C++, Java, Scala
P2P Goals ● Easier to scale ● Fewer servers ● Less bandwidth ● Better uptime
Spotify P2P network ● Not a piracy network: all tracks are added by Spotify ● Used by all desktop clients (not by mobile ones) ● A track is downloaded from several peers
Spotify P2P network ● Spotify uses an unstructured P2P overlay topology – network built and maintained by means of trackers (similar to BitTorrent) – no super nodes with special maintenance functions (as opposed to Skype) – no DHT to find peers/content (as opposed to eDonkey) – no routing: discovery messages get forwarded to other peers for one hop at most ● Advantages: – keeps the protocol simple – keeps the bandwidth overhead on clients down – reduces latency ● This is possible because Spotify can leverage a centralized and fast backend (as opposed to completely distributed P2P networks)
P2P Structure ● Edges are formed as needed ● Nodes have fixed maximum degree (60) ● Neighbor eviction (removal) by heuristic evaluation of utility ● A user only downloads data she needs ● P2P network becomes (weakly) clustered by interest ● Unaware of network architecture
Spotify compared to BitTorrent ● One (well, three) P2P overlay for all tracks (not per-torrent) ● Does not inform peers about downloaded blocks ● Downloads blocks in order ● Does not enforce fairness (such as tit-for-tat) ● Informs peers about urgency of request
Limit resource usage ● Cap number of neighbors ● Cap number of simultaneous uploads – TCP Congestion Control gives “fairness” between connections ● Cap cache size ● Mobile clients don’t participate in P2P
Sharing tracks ● A client cannot upload a track to its peers unless it has the whole track – advantage: greatly simplifies the protocol and keeps the overhead low ● clients do not have to communicate (to their peers or to the server) what parts of a track they have – drawback: reduces the number of peers a client can download a track from (i.e., slower downloads) ● tracks are small though (few MB each), so this has a limited effect
Caching ● Player caches tracks it has played ● Default policy is to use 10% of free space (capped at 10 GB) ● Caches are large (56% are over 5 GB) ● Least Recently Used policy for cache eviction ● Over 50% of data comes from local cache ● Cached files are served in P2P overlay
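A sketch of the cache policy just described: a budget of 10% of free disk space capped at 10 GB, with least-recently-used eviction. Names and structure are illustrative, not the actual client code.

```python
# Sketch of the client-side track cache: budget = min(10% of free disk, 10 GB),
# least-recently-used eviction. Illustrative only.
import collections, shutil

def cache_budget(path="/"):
    free = shutil.disk_usage(path).free
    return min(int(0.10 * free), 10 * 1024**3)    # 10% of free space, capped at 10 GB

class TrackCache:
    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.used = 0
        self.entries = collections.OrderedDict()  # track_id -> size, in LRU order

    def touch(self, track_id, size):
        if track_id in self.entries:
            self.entries.move_to_end(track_id)    # played again: most recently used
            return
        while self.used + size > self.budget and self.entries:
            _, old_size = self.entries.popitem(last=False)   # evict the LRU track
            self.used -= old_size
        self.entries[track_id] = size
        self.used += size

cache = TrackCache(cache_budget())
cache.touch("spotify:track:6JEK0CvvjDjjMUBFoXShNZ", 5 * 1024**2)   # ~5 MB track
```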
Caching ● Advantages – Reduces the chances that a client has to re-download already played tracks – Increases the chances that a client can get a track from the P2P net ● Lower load on the Spotify servers ● Drawback – Impact on the users' disks ● Caches are large compared to the typical track size
Music vs Movies ● Music – Small (5 minutes, 5 MB) – Many plays/session – Large catalog – Main problem: peer discovery ● Movies – Large (2 hours, 1.5 GB) – High bit rate – Active users – Main problem: download strategy
Finding Peers ● There are two ways a client can locate the peers: – ask the server (BitTorrent style) – ask the other peers (Gnutella style) ● Client uses both mechanisms for every track
Finding Peers (server) ● Server-side tracker (BitTorrent style) – Only remembers 20 peers per track (not all of them) – Clients do not report the content of their caches to the server! – Returns 10 (online) peers to the client on a query ● Advantages – Fewer resources needed on the server – Simplifies the implementation of the tracker ● Drawback – Only a fraction of the peers can be located through the tracker ● Not a big issue, since clients can ask the other peers
Finding Peers (P2P) ● Broadcast query in a small (2-hop) neighborhood in the overlay (Gnutella style) – Each client has a set of neighbours in the P2P network ● Peers the client has previously downloaded a track from or uploaded a track to – When the client has to download a new track, it checks whether its neighbours have it – Peers can in turn forward the search request to their own neighbours ● The process stops at distance 2 ● Each query has a unique id, to ignore duplicate queries
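A sketch of this Gnutella-style lookup: the query carries a unique id so duplicates are ignored, and it is forwarded at most one extra hop (so it never travels beyond distance 2 from the searcher). All names are illustrative.

```python
# Sketch of the 2-hop flooded peer search with duplicate suppression.
import uuid

class Peer:
    def __init__(self, name, cached_tracks):
        self.name = name
        self.cached = set(cached_tracks)
        self.neighbors = []
        self.seen = set()                       # query ids already handled

    def search(self, track_id):
        results = []
        qid = uuid.uuid4().hex                  # unique id to ignore duplicates
        for n in self.neighbors:
            n.handle_query(qid, track_id, ttl=2, results=results)
        return results

    def handle_query(self, qid, track_id, ttl, results):
        if qid in self.seen:
            return                              # duplicate query: drop it
        self.seen.add(qid)
        if track_id in self.cached:
            results.append(self.name)           # this peer can serve the track
        if ttl > 1:                             # forward at most one extra hop
            for n in self.neighbors:
                n.handle_query(qid, track_id, ttl - 1, results)

a, b, c = Peer("a", []), Peer("b", []), Peer("c", ["t42"])
a.neighbors, b.neighbors = [b], [a, c]
print(a.search("t42"))                          # track found two hops away -> ['c']
```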
Neighbor Selection ● A client uploads to at most 4 peers at any given time – Helps Spotify behave nicely with concurrent application streams (e.g., browsing) ● Connections to peers do not get closed after a download/upload – Advantage: reduces time to discover new peers when a new track has to be played – Drawback: keeping the state required to maintain a large number of TCP connections to peers is expensive (in particular for home routers acting as stateful firewall and NAT devices)
Neighbor Selection ● To keep the overhead low, clients impose both a soft and a hard limit to the number of concurrent connections to peers (set to 50 and 60 respectively) – when the soft limit is reached, a client stops establishing new connections to other peers (though it still accepts new connections from other peers) – when the hard limit is reached, no new connections are either established or accepted
Neighbor Selection ● When the soft limit is reached, the client starts pruning its connections, leaving some space for new ones ● To do so, the client computes a utility score for each connected peer by considering, among other factors: – the number of bytes sent (received) from the peer in the last 60 (respectively 10) minutes – the number of other peers the peer has helped discover in the last 10 minutes ● Peers are sorted by their utility, and the peers with the lowest total scores are disconnected
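A sketch of the pruning step: compute a utility score per connected peer from recent traffic and peer-discovery help, then drop the lowest-scoring peers once the soft limit is hit. The field names and the weight below are made up for illustration; the actual scoring function is not public.

```python
# Sketch of the neighbor-eviction heuristic. The quantities and the weight are
# illustrative, not Spotify's real scoring function.

SOFT_LIMIT, HARD_LIMIT = 50, 60

def utility(peer):
    return (peer["bytes_sent_60min"]                        # recent traffic exchanged
            + peer["bytes_received_10min"]
            + 1_000_000 * peer["peers_discovered_10min"])   # peer-discovery help

def prune(connections):
    """Keep only the most useful peers once the soft limit is reached."""
    if len(connections) < SOFT_LIMIT:
        return connections
    ranked = sorted(connections, key=utility, reverse=True)
    return ranked[:SOFT_LIMIT - 1]              # make room for new connections
```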
P2P NAT Traversal ● Asks to open ports via UPnP – Universal Plug and Play (UPnP): set of networking protocols that permits networked devices to seamlessly discover each other's presence on the network and establish functional network services for data sharing, communications, and entertainment ● Attempt connections in both directions ● High connection failure rate (65%) ● Room for improvement
Streaming a track ● A track can be simultaneously downloaded from the Spotify servers (CDN) and the P2P Network ● Request first piece from Spotify servers ● Meanwhile, search Peer-to-peer (P2P) for remainder ● Switch back and forth between Spotify servers and peers as needed ● Towards end of a track, start prefetching next one
Streaming a track ● Tracks are split in 16KB chunks ● If both CDN and P2P are used, the client never downloads from the Spotify CDN more than 15 seconds ahead of the current playback point ● To select the peers to request the chunks from, the client sorts them by their expected download times and greedily requests the most urgent chunk from the top peer – Expected download times are computed using the average download speed received from the peers – if a peer happens to be too slow, another peer is used ● Upload in the P2P net – Simultaneous upload to at most 4 peers, due to TCP congestion control – Serving clients/peers sort by priority and speed → serve top 4 peers
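A sketch of the greedy scheduler just described: chunks are considered in playback (urgency) order and each is assigned to the peer with the smallest expected completion time, estimated from that peer's measured average speed and the work already assigned to it. Field names are illustrative, not the real client's.

```python
# Sketch of urgent-chunk scheduling across peers, based on expected download time.

CHUNK_SIZE = 16 * 1024                            # tracks are split into 16 KB chunks

def expected_time(peer, assigned_bytes):
    return (assigned_bytes + CHUNK_SIZE) / peer["avg_speed_bytes_per_s"]

def schedule(missing_chunks, peers):
    """missing_chunks: chunk indices ordered by urgency (playback order)."""
    assignments, assigned = {}, {p["name"]: 0 for p in peers}
    for chunk in missing_chunks:
        best = min(peers, key=lambda p: expected_time(p, assigned[p["name"]]))
        assignments[chunk] = best["name"]
        assigned[best["name"]] += CHUNK_SIZE
    return assignments

peers = [{"name": "p1", "avg_speed_bytes_per_s": 500_000},
         {"name": "p2", "avg_speed_bytes_per_s": 100_000}]
print(schedule(range(8), peers))                  # the faster peer gets most chunks
```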
Streaming a track ● The client continuously monitors the playout buffer – (i.e., the portion of the track that has been downloaded so far but not yet played) ● If the buffer becomes too low (< 3 seconds) the client enters an emergency mode, where: – it stops uploading to the other peers – it uses the Spotify CDN ● this helps when the client fails to find a reliable and fast set of peers to download the chunks from
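A minimal sketch of the emergency-mode rule (the 3-second threshold is from the slide; everything else is illustrative):

```python
# Minimal sketch of the emergency-mode check: with less than 3 seconds of audio
# left in the playout buffer, stop uploading to peers and fetch from the CDN.

EMERGENCY_THRESHOLD_SECONDS = 3.0

def choose_mode(buffered_seconds):
    if buffered_seconds < EMERGENCY_THRESHOLD_SECONDS:
        return {"upload_to_peers": False, "download_from": "cdn"}      # emergency
    return {"upload_to_peers": True, "download_from": "p2p_and_cdn"}   # normal

print(choose_mode(2.1))   # -> emergency mode
```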
When to start playing? ● Minimize latency while avoiding stutter – Stutter: interruptions during the playout ● TCP throughput varies – Sensitive to packet loss – Bandwidth over wireless media varies ● Heuristics
Play-out Delay ● Non-delay-preserving play-out: the client does not drop frames or slow down the play-out rate ● Markov chain modeling: – Considers the experienced throughput and the buffer size – 100 simulations of playing back the track while downloading ● Determine if a buffer underrun (stutter) occurs – If more than one underrun occurs → wait longer before starting playback
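A simplified sketch of this "simulate before playing" heuristic: run 100 simulated playbacks against recent throughput samples and start only if at most one of them underruns. The throughput model here is a plain random draw over observed samples, not Spotify's actual Markov-chain model.

```python
# Simplified sketch of the simulation-based play-out delay decision.
import random

def simulate_playback(buffered_bytes, track_bytes, bitrate_bps, throughput_samples_bps):
    downloaded, played = buffered_bytes, 0.0
    while played < track_bytes:
        downloaded = min(track_bytes, downloaded + random.choice(throughput_samples_bps) / 8)
        played = min(track_bytes, played + bitrate_bps / 8)   # one simulated second
        if played > downloaded:
            return False                      # buffer underrun (stutter)
    return True

def safe_to_start(buffered_bytes, track_bytes, bitrate_bps, samples, runs=100):
    underruns = sum(not simulate_playback(buffered_bytes, track_bytes, bitrate_bps, samples)
                    for _ in range(runs))
    return underruns <= 1                     # more than one underrun -> keep buffering

print(safe_to_start(64 * 1024, 5 * 1024**2, 160_000, [180_000, 120_000, 200_000]))
```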
Users Playing a Track ● Around 61% of tracks are played in a predictable order (i.e., the previous track has finished, or the user has skipped to the next track) – playback latency can be reduced by predicting what is going to be played next. ● The remaining 39% are played in random order (e.g., the user suddenly changes album, or playlist) – predicting what the user is going to play next is too hard – Playback latency may be higher
Playing: Random Access ● When tracks are played in an unpredictable (random) order, fetching them using only the P2P network would negatively impact the playback delay ● Why? – Searching for peers who can serve the track takes time (mostly because multiple messages need to be exchanged with each peer) – Some peers may have poor upload bandwidth capacity (or may be busy uploading the track to some other client) – A new connection to a peer requires some time before it starts working at full rate (see next slides) – P2P connections are unreliable (e.g., they may fail at any time)
Playing: Random Access ● How to solve the problem? ● Possible solution: use the fast Spotify Content Delivery Network (CDN) – Drawback: more weight on the Spotify CDN (higher monetary cost for Spotify, and possibly for its users too) ● Better solution: use the Spotify CDN to ask for the first 15 seconds of the track only – Advantage: this buys a lot of time the client can use to search the peer-to-peer network for peers who can serve the track – Advantage: the Spotify CDN is used just to recover from a critical situation (in this case, when the user has started playing a random track)
Playing: Sequential Access ● When users listen to tracks in a predictable order (i.e., a playlist, or an album), the client has plenty of time to prefetch the next track before the current one finishes. ● Problem: – You don’t really know whether the user is actually going to listen to the next track or not. – If the user plays a random track instead of the predicted one, you end up having wasted bandwidth resources.
Playing: Sequential Access ● Solution: start prefetching the next track only when the previous track is about to finish, as Spotify has experimentally observed that: – When the current track has only 30 seconds left, the user is going to listen to the following one in 92% of the cases. – When 10 seconds are left, the percentage rises to 94% ● The final strategy is: – 30 seconds left: start searching for peers who can serve the next track – 10 seconds left: if no peers are found (critical scenario!), use the Spotify CDN
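A sketch of the resulting prefetch policy (the 30-second and 10-second thresholds are from the slide; the function and action names are illustrative):

```python
# Sketch of the prefetch policy for sequential playback.

def prefetch_action(seconds_left, peers_found):
    if seconds_left <= 10 and not peers_found:
        return "fetch_next_from_cdn"          # critical: no peers found in time
    if seconds_left <= 30:
        return "search_p2p_for_next_track"    # 92% chance the next track is played
    return "do_nothing"

print(prefetch_action(25, peers_found=False))   # -> search_p2p_for_next_track
print(prefetch_action(8, peers_found=False))    # -> fetch_next_from_cdn
```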
TCP Congestion Window ● RECAP: – TCP maintains a congestion window (cwnd) – cwnd is used to avoid network congestion – A TCP sender can never have more than cwnd un-acked bytes outstanding – Additive increase, multiplicative decrease – RESTART WINDOW (RW): the size of cwnd when TCP restarts transmission after an idle period (timeout)
TCP Congestion Window ● RECAP: – RFC 5681 (TCP Congestion Control) says: ● « […] Therefore, a TCP SHOULD set cwnd to no more than RW before beginning transmission if the TCP has not sent data in an interval exceeding the retransmission timeout » ● → when you start sending data again after an idle period, the transmission restarts slowly
TCP Congestion Window and Spotify ● Spotify traffic is bursty ● Initial burst is very latency-critical ● Want to avoid needless reduction of congestion window ● Configure kernels to not follow the RFC 5681 SHOULD (???) – Dangerous if there are multiple connections
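On Linux, the relevant kernel knob is the net.ipv4.tcp_slow_start_after_idle sysctl; a minimal sketch of inspecting and disabling it (writing requires root, and, as the slide warns, disabling it can be unfair to competing connections):

```python
# Check / disable the Linux "restart slow start after idle" behavior (RFC 5681
# SHOULD) via its sysctl. Disabling it requires root and affects fairness.

PATH = "/proc/sys/net/ipv4/tcp_slow_start_after_idle"

def slow_start_after_idle_enabled():
    with open(PATH) as f:
        return f.read().strip() == "1"

def disable_slow_start_after_idle():
    with open(PATH, "w") as f:                # needs root / CAP_NET_ADMIN
        f.write("0\n")

print("restart-after-idle enabled:", slow_start_after_idle_enabled())
```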
Security in the Spotify P2P Network ● Control access to participate ● Verify integrity of downloaded files ● Data transferred in the P2P network is encrypted ● Usernames are not exposed in the P2P network; all peers are assigned pseudonyms
Playlist ● Most complex service in Spotify ● Simultaneous writes with automatic conflict resolution ● Publish-subscribe system to clients ● Changes are automatically versioned; only deltas are transmitted ● Terabytes of data
Evaluation (outdated) ● Collected measurements 23–29 March 2010 ● (Before Facebook integration, local files, ...)
Data Sources ● Mostly minor variations over time – Better P2P performance on weekends – P2P most effective at peak hours ● 8.8% from servers ● 35.8% from P2P ● 55.4% from caches
Latency and Stutter ● Median latency: 265 ms ● 75th percentile: 515 ms ● 90th percentile: 1047 ms ● Fewer than 1% of playbacks had stutter
Finding peers ● Each mechanism by itself is fairly effective
Traffic ● Measured at socket layer ● Unused data means it was canceled/duplicate
Track Accesses ● There is no cost per track for users ● What does the usage pattern look like? ● How is that affected by caches and P2P? ● 60% of the catalog was accessed ● 88% of track playbacks were within the most popular 12% of tracks ● 79% of server requests were within the most popular 21%