Github Stanfordsnr Puffer Statistics Data Analysis Scripts For Puffer

Bonisiwe Shabane

-Dec 29, 2025, 12:03 PM


Dumps and analyzes the raw data recorded by Puffer, a TV streaming website and Stanford research study. Anyone can download the daily data, build the analysis programs, and run the pipeline themselves. We encourage the public to replicate our results, posted every day on the Puffer website Results page along with the anonymized raw data. The Data Description webpage explains the format of these results, while this README details the analysis pipeline which produces them. To set up a machine for the analysis, see scripts/init_data_release_vm.sh. This installs the dependencies listed in scripts/deps.sh, creates a local directory for the results, and builds the analysis programs.

Note that scripts/deps.sh installs packages as sudo, so users may prefer to manage dependencies on their own. Dependencies marked as "private" in the script are not required for users. The analysis pipeline has been tested on Ubuntu 19.10 and 18.04 (the latter requires slight modifications; see scripts/init_data_release_vm.sh). Given a date, the pipeline outputs CSVs containing the day’s (anonymized) raw data, as well as stream and scheme statistics. Scheme statistics are calculated over the day as well as several time periods preceding it (week, two-week, month, and experiment duration). Puffer is a free and open-source live TV research study operated by Stanford University that uses machine learning to improve video streaming algorithms.

It is written mostly in the C++ programming language[1] and relies on WebSocket as a transmission layer.[2] The study allows users across the United States to watch seven over-the-air television stations broadcasting in the San Francisco Bay Area media market for free.[3] Puffer was launched on January 18, 2019. It was initially led by Francis Yan, a Stanford computer science doctoral student, with Hudson Ayers and Sadjad Fouladi from Stanford, and Chenzhi Zhu from Tsinghua University. The project's faculty advisors are professors Keith Winstein and Philip Levis.[4][5] The research study uses machine learning to improve video-streaming algorithms, such as those commonly used by services like YouTube, Netflix, and Twitch. The goal is to teach a computer to design new algorithms that reduce glitches and stalls in streaming video (especially over wireless networks and those with limited capacities, such as in rural areas), improve...

The service is limited. Only those in the U.S. can sign up, and only up to 500 users can watch Puffer at a time. In addition, the service only re-transmits free over-the-air television channels in the San Francisco Bay Area media market, specifically the following ones picked up by an antenna located on the Stanford campus: KTVU 2... The service is not compatible with Apple’s Safari browser or with iPhone and iPad devices because it relies on Media Source Extensions, which are not supported on those platforms. Along with our research paper, we are publishing anonymized data collected on Puffer for the research community to investigate.

As our experiments are ongoing, new data is collected each day. This data is posted daily to the Experiment Results page, which also contains all data collected since the experiment began in January 2019. On this page, we provide a brief description. Please see the README in the puffer-statistics repo for more details on the data analysis. A single day of data is several GB, so please download a small set of fake data first to determine if the full Puffer data is indeed what you need. We would also be grateful if you could download the data from our server only once.

If anything is not clear in the below data description, please don't hesitate to post a question in our Google Group. At a high level, each day's Puffer data comprises different "measurements" — each measurement contains a different set of time-series data collected on Puffer servers, and is dumped as a CSV file. The CSV files that are essential for analysis include video_sent_X.csv, video_acked_X.csv, and client_buffer_X.csv, where X represents the day when the data was collected. For example, "2019-11-04T11_2019-11-05T11" means the data was collected between 2019-11-04T11:00:00Z and 2019-11-05T11:00:00Z (UTC is the default time zone). In addition to these three CSVs, we also release video_size_X.csv and ssim_X.csv that are described below. A special field in many CSV files is expt_id.

This is a unique ID identifying information associated with a "scheme", or pair of ABR and congestion control algorithms such as Fugu/BBR. The expt_id can be used as a key to retrieve the associated settings (e.g. algorithms and git commit) in the logs/expt_settings file. Each day has its own logs/expt_settings, containing the settings of all schemes run on Puffer between January 2019 and that day (as well as later days, if the analysis was performed later). If an expt_id were missing in the file, it would suggest an out-of-date file. The csv_to_stream_stats program in the puffer-statistics repo provides a function to parse this file.
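The X suffix in the measurement file names described above encodes a UTC collection window. The following is a minimal sketch of how that suffix might be parsed, assuming the suffix always has the form shown in the example ("2019-11-04T11_2019-11-05T11"). The function names `suffix_from_filename` and `parse_collection_window` are hypothetical helpers written for illustration; they are not part of the puffer-statistics repo.

```python
from datetime import datetime, timezone
from typing import Tuple

# Hypothetical helpers for illustration; not part of puffer-statistics.

def suffix_from_filename(filename: str, measurement: str) -> str:
    """Return the X part of a name such as 'video_sent_X.csv'."""
    prefix = measurement + "_"
    if not (filename.startswith(prefix) and filename.endswith(".csv")):
        raise ValueError(f"{filename!r} is not a {measurement} CSV name")
    return filename[len(prefix):-len(".csv")]


def parse_collection_window(suffix: str) -> Tuple[datetime, datetime]:
    """Split a suffix like '2019-11-04T11_2019-11-05T11' into UTC start and end."""
    start_text, end_text = suffix.split("_")
    fmt = "%Y-%m-%dT%H"  # date plus hour, as in the example above
    start = datetime.strptime(start_text, fmt).replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_text, fmt).replace(tzinfo=timezone.utc)
    return start, end


suffix = suffix_from_filename(
    "video_sent_2019-11-04T11_2019-11-05T11.csv", "video_sent"
)
start, end = parse_collection_window(suffix)
print(start.isoformat(), end.isoformat())
```

Attaching the UTC time zone explicitly avoids the window being interpreted in the local time zone of the machine running the analysis.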

Note that the research paper also uses the term "experiment" to refer to a group of schemes, e.g. the "primary experiment", whereas the expt_id refers to a single scheme. Additionally, there are two terms that we will use in the description: "stream" and "session". When a Puffer client watches TV for the first time or reloads the player page, it starts a new "session", identified by session_id in the CSVs. When a client switches channels, it enters into a different "stream" but still remains in the same "session", which uses the same TCP connection. Each CSV contains an index field solely used to group streams.

Two datapoints are considered part of the same stream if and only if they share both session_id and index. The values of session_id and index are not meaningful otherwise.
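The stream rule above can be sketched as a grouping on the pair (session_id, index). This is an illustrative example only: it assumes each CSV has a header row naming its columns, and the sample rows and the `value` column are made up. Only the `session_id` and `index` field names come from the data description; `group_by_stream` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data; only session_id and index are real field names.
SAMPLE = """session_id,index,value
abc,0,1
abc,0,2
abc,1,3
xyz,0,4
"""


def group_by_stream(rows):
    """Group datapoints by (session_id, index), the pair that identifies a stream."""
    streams = defaultdict(list)
    for row in rows:
        # Both fields are opaque identifiers, so they are compared as strings.
        streams[(row["session_id"], row["index"])].append(row)
    return dict(streams)


streams = group_by_stream(csv.DictReader(io.StringIO(SAMPLE)))
for stream_key, datapoints in streams.items():
    print(stream_key, len(datapoints))
```

Grouping on session_id alone would merge streams from different channels within one session, so the key needs both fields.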


KRON 4 (The CW/MyNetworkTV) was added in March 2024 after KPYX (then KBCW) changed its CW affiliate status.


Stream live TV in your browser. There's no charge. You can watch U.S. TV stations affiliated with the NBC, CBS, ABC, PBS, Fox, and CW networks. Puffer works in the Chrome, Firefox, Edge, and Opera browsers, on a computer or an Android phone or tablet.

Puffer does not work on iPhones or iPads or in Safari. Puffer is a research project in the computer science department at Stanford University. Please find more details in the FAQ and our research paper (USENIX NSDI '20 Community Award, IRTF Applied Networking Research Prize '21). Puffer is a Stanford University research study about using machine learning to improve video-streaming algorithms: the kind of algorithms used by services such as YouTube, Netflix, and Twitch.

We are trying to figure out how to teach a computer to design new algorithms that reduce glitches and stalls in streaming video (especially over wireless networks and those with limited capacity, such as... Watch TV on this website. The idea of this study is to stream TV channels to study participants over the Internet, and the Puffer website will automatically experiment with different algorithms that control the timing and quality of video sent... The more diverse the Internet connections that the study participants use, the better the system will be able to learn, and the more robust the resulting computer-generated algorithms will be. Yes. Well, technically you don't even have to watch.

We just need people to stream video over different kinds of Internet connections, so that the computer has some live traffic to learn from and experiment on. Visit the Sign up page to join. You must be within the United States to use Puffer. No, there is no charge to participate. Puffer is an academic project at Stanford University and is entirely non-profit.
