feat: support running measurements in parallel by bassosimone · Pull Request #848 · ooni/probe-cli

bassosimone · 2022-08-02T11:58:59Z

This experimental diff stems from a discussion I had this morning
with @hellais, where this functionality seems helpful to him.

The diff itself breaks a bunch of places in the codebase, hence
it's not suitable for merging as is, but required more work, and
in particular, adapting tests and making sure we're not breaking
any of the supported use cases.

As an overview, the core of this diff consists of making sure
that the InputProcessor can run measurements in parallel.

In turn, this requires using a service for submitting and a service
for writing measurements back to disk. These two services should
be chained: saving to disk happens after submitting such that we
write on disk a measurement with the report ID we got after we
submitted the measurement to the backend.

There is a queue between the measurer and the submitter with a buffer
of four places. I have experimentally determined that a larger than
this number would cause the code to really dishonour the maximum
runtime. In general, measuring could be faster than submitting thus
creating a large queue would mean a very long endgame condition
where we're busy submitting the queued measurements.

FWIW, if we're moving on with this approach, it would be quite
interesting to submit a compressed batch of measurements.

There are also some open questions about how it's best to report
errors, given that now the code runs asynchronously. We'll see.

Checklist

I have read the contribution guidelines
reference issue for this pull request:
if you changed anything related how experiments work and you need to reflect these changes in the ooni/spec repository, please link to the related ooni/spec pull request:

Description

Please, insert here a more detailed description.

@hellais

This experimental diff stems from a discussion I had this morning with @hellais, where this functionality seems helpful to him. The diff itself breaks a bunch of places in the codebase, hence it's not suitable for merging as is, but required more work, and in particular, adapting tests and making sure we're not breaking any of the supported use cases. As an overview, the core of this diff consists of making sure that the InputProcessor can run measurements in parallel. In turn, this requires using a service for submitting and a service for writing measurements back to disk. These two services should be chained: saving to disk happens _after_ submitting such that we write on disk a measurement with the report ID we got after we submitted the measurement to the backend. There is a queue between the measurer and the submitter with a buffer of four places. I have experimentally determined that a larger than this number would cause the code to really dishonour the maximum runtime. In general, measuring could be faster than submitting thus creating a large queue would mean a very long endgame condition where we're busy submitting the queued measurements. FWIW, if we're moving on with this approach, it would be quite interesting to submit a compressed batch of measurements. There are also some open questions about how it's best to report errors, given that now the code runs asynchronously. We'll see.

It's important to first wait for the submitted to have finished consuming it's queue before stopping the saver queue, otherwise we will miss saving the last measurement to disk.

bassosimone · 2022-08-03T15:34:48Z

We saw this -race error:

==================
WARNING: DATA RACE
Write at 0x00c00002c190 by main goroutine:
  runtime.closechan()
      /usr/local/opt/go/libexec/src/runtime/chan.go:356 +0x0
  github.com/ooni/probe-cli/v3/internal/engine.(*asyncSaver).Stop()
      /Users/art/code/ooni/probe-cli/internal/engine/saver.go:96 +0x64
  github.com/ooni/probe-cli/v3/internal/engine.(*InputProcessor).run()
      /Users/art/code/ooni/probe-cli/internal/engine/inputprocessor.go:101 +0x3fb
  github.com/ooni/probe-cli/v3/internal/engine.(*InputProcessor).Run()
      /Users/art/code/ooni/probe-cli/internal/engine/inputprocessor.go:71 +0x44
  github.com/ooni/probe-cli/v3/internal/oonirun.(*Experiment).Run()
      /Users/art/code/ooni/probe-cli/internal/oonirun/experiment.go:115 +0x745
  main.mainSingleIteration()
      /Users/art/code/ooni/probe-cli/internal/cmd/miniooni/libminiooni.go:416 +0x1193
  main.MainWithConfiguration()
      /Users/art/code/ooni/probe-cli/internal/cmd/miniooni/libminiooni.go:296 +0x34e
  main.Main()
      /Users/art/code/ooni/probe-cli/internal/cmd/miniooni/libminiooni.go:158 +0x287
  main.main()
      /Users/art/code/ooni/probe-cli/internal/cmd/miniooni/main.go:20 +0x3c

Previous read at 0x00c00002c190 by goroutine 81:
  runtime.chansend()
      /usr/local/opt/go/libexec/src/runtime/chan.go:159 +0x0
  github.com/ooni/probe-cli/v3/internal/engine.(*asyncSaver).SaveMeasurement()
      /Users/art/code/ooni/probe-cli/internal/engine/saver.go:73 +0xe4
  github.com/ooni/probe-cli/v3/internal/engine.(*asyncSubmitter).run()
      /Users/art/code/ooni/probe-cli/internal/engine/submitter.go:119 +0x171
  github.com/ooni/probe-cli/v3/internal/engine.StartAsyncSubmitter.func1()
      /Users/art/code/ooni/probe-cli/internal/engine/submitter.go:85 +0x71

Goroutine 81 (finished) created at:
  github.com/ooni/probe-cli/v3/internal/engine.StartAsyncSubmitter()
      /Users/art/code/ooni/probe-cli/internal/engine/submitter.go:85 +0x309
  github.com/ooni/probe-cli/v3/internal/engine.(*InputProcessor).run()
      /Users/art/code/ooni/probe-cli/internal/engine/inputprocessor.go:85 +0xf9
  github.com/ooni/probe-cli/v3/internal/engine.(*InputProcessor).Run()
      /Users/art/code/ooni/probe-cli/internal/engine/inputprocessor.go:71 +0x44
  github.com/ooni/probe-cli/v3/internal/oonirun.(*Experiment).Run()
      /Users/art/code/ooni/probe-cli/internal/oonirun/experiment.go:115 +0x745
  main.mainSingleIteration()
      /Users/art/code/ooni/probe-cli/internal/cmd/miniooni/libminiooni.go:416 +0x1193
  main.MainWithConfiguration()
      /Users/art/code/ooni/probe-cli/internal/cmd/miniooni/libminiooni.go:296 +0x34e
  main.Main()
      /Users/art/code/ooni/probe-cli/internal/cmd/miniooni/libminiooni.go:158 +0x287
  main.main()
      /Users/art/code/ooni/probe-cli/internal/cmd/miniooni/main.go:20 +0x3c
==================

for a previous version of this code (b2f8b47). I think the way in which I am using atomicx here may be wrong!

bassosimone and others added 2 commits August 2, 2022 13:53

Address race condition in closing of the WaitGroups

dc8e9cf

It's important to first wait for the submitted to have finished consuming it's queue before stopping the saver queue, otherwise we will miss saving the last measurement to disk.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: support running measurements in parallel#848

feat: support running measurements in parallel#848
bassosimone wants to merge 2 commits intomasterfrom
parallel

bassosimone commented Aug 2, 2022

Uh oh!

bassosimone commented Aug 3, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

bassosimone commented Aug 2, 2022

Checklist

Description

Uh oh!

bassosimone commented Aug 3, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants