dc.contributor.author | Pou Mulet, Bartomeu |
dc.contributor.author | Quiñones, Eduardo |
dc.contributor.author | Martín Muñoz, Mario |
dc.date.accessioned | 2021-06-04T07:49:56Z |
dc.date.available | 2021-06-04T07:49:56Z |
dc.date.issued | 2021-05 |
dc.identifier.citation | Pou Mulet, B.; Quiñones, E.; Martín Muñoz, M. Adaptive optics control with reinforcement learning: first steps. In: . Barcelona Supercomputing Center, 2021, p. 87-89. |
dc.identifier.uri | http://hdl.handle.net/2117/346628 |
dc.description.abstract | When planar wavefronts from distant stars traverse the atmosphere, they become distorted by the atmosphere's inhomogeneous temperature distribution. Adaptive Optics (AO) is the field in charge of correcting those distortions, allowing high-quality observations of distant targets. An AO system is composed of three main components: a deformable mirror (DM) that corrects the deformation of the wavefront, a wavefront sensor (WFS) that characterises the current turbulence in the wavefront, and a real-time controller (RTC) that issues the commands which, via the deformation of the DM, correct the wavefront. Usually, these operations are performed in closed loop under stringent real-time requirements (on the order of 10^3-10^4 actions per second). At each iteration, the WFS observes the wavefront after it has been corrected by the DM, and the RTC issues commands to correct for the evolution of the turbulence and for previously uncorrected errors (Figure 1, left).

One of the primary sources of error for an AO control algorithm is the temporal error: because of the delay between characterising the turbulence with the WFS and setting the desired commands on the DM, any successful control approach must take into account past commands and the probable evolution of the atmosphere during this gap of time. The most common approaches in AO are variants of Linear Quadratic Gaussian (LQG) control with Kalman filters, one of whose initial iterations was presented in [1]. Typically, a linear model of the system's evolution is built with a set of parameters fitted from observations or theoretical assumptions, which limits the system's capability to correct the turbulence.

In this paper, we present a novel solution based on Reinforcement Learning (RL), driven by a reward signal to be optimised, that is non-linear and requires no previously built model (unlike LQG). RL has already been applied in the domain of AO; however, it has been limited to WFS-less systems (e.g. [2]) or, more recently, to controlling a very limited number of actuators [3]. The main practical objective of this work is its application to the 8.2 m Subaru telescope (located in Hawaii), whose system includes thousands of actuators.
|
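The closed loop and the temporal error described in the abstract can be made concrete with a toy simulation. The sketch below is a minimal illustration, not the authors' implementation: the interaction matrix D, the integrator gain, the first-order turbulence model, and all dimensions are hypothetical placeholders chosen only to show the mechanics of an integrator controller operating one frame behind the atmosphere.

```python
import numpy as np

# Toy AO closed loop with an integrator controller and a one-frame delay.
# All quantities (dimensions, interaction matrix D, gain, turbulence model)
# are hypothetical placeholders for illustration only.
n_act, n_meas = 64, 128
rng = np.random.default_rng(0)

D = rng.normal(size=(n_meas, n_act))   # assumed linear WFS response to DM commands
cmat = np.linalg.pinv(D)               # command matrix: least-squares reconstructor

gain = 0.4                             # integrator gain
commands = np.zeros(n_act)             # commands computed by the RTC
applied = np.zeros(n_act)              # commands actually on the DM (one frame late)

turbulence = rng.normal(size=n_act)
for step in range(1000):
    # The atmosphere evolves between the WFS measurement and the DM update:
    # this gap is the temporal error the abstract refers to.
    turbulence = 0.99 * turbulence + 0.1 * rng.normal(size=n_act)
    slopes = D @ (turbulence - applied)           # WFS sees the residual wavefront
    commands = commands + gain * (cmat @ slopes)  # integrator update
    applied = commands                            # takes effect on the next frame
```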
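The RL formulation outlined in the abstract can likewise be sketched as an environment interface. This is a hedged illustration under assumed conventions (a gym-style reset/step API, observation stacking WFS slopes with the previous command so the policy can compensate for the delay, reward equal to the negative residual norm); it is not the environment used in the paper, and the class ToyAOEnv and its parameters are invented for this example.

```python
import numpy as np

class ToyAOEnv:
    """Hypothetical gym-style view of the AO control problem: the agent's
    action is a DM command increment, the observation stacks the current WFS
    slopes with the current commands (so the policy can account for the
    delay), and the reward is the negative norm of the residual slopes.
    Illustrative only; not the authors' formulation."""

    def __init__(self, n_act=8, n_meas=16, seed=0):
        self.rng = np.random.default_rng(seed)
        self.D = self.rng.normal(size=(n_meas, n_act))  # toy interaction matrix
        self.n_act = n_act

    def reset(self):
        self.turbulence = self.rng.normal(size=self.n_act)
        self.commands = np.zeros(self.n_act)
        slopes = self.D @ (self.turbulence - self.commands)
        return np.concatenate([slopes, self.commands])

    def step(self, action):
        self.commands = self.commands + action          # apply command increment
        self.turbulence = (0.99 * self.turbulence
                           + 0.1 * self.rng.normal(size=self.n_act))
        slopes = self.D @ (self.turbulence - self.commands)
        reward = -float(np.linalg.norm(slopes))         # small residual = high reward
        obs = np.concatenate([slopes, self.commands])
        return obs, reward, False, {}
```

A model-free agent (e.g. a policy-gradient method) trained against such an interface would have to learn the predictive behaviour that LQG obtains from an explicit linear model, which is the contrast the abstract draws.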
dc.format.extent | 3 p. |
dc.language | en |
dc.language.iso | eng |
dc.publisher | Barcelona Supercomputing Center |
dc.subject | Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors |
dc.subject.lcsh | High performance computing |
dc.subject.other | Reinforcement Learning |
dc.subject.other | Adaptive Optics |
dc.subject.other | Nonlinear Control |
dc.subject.other | Machine Learning |
dc.title | Adaptive optics control with reinforcement learning: first steps |
dc.type | Conference report |
dc.subject.lemac | Càlcul intensiu (Informàtica) |
dc.rights.access | Open Access |
local.citation.startingPage | 87 |
local.citation.endingPage | 89 |