OPPO has 7 papers selected and wins 8 challenges at CVPR 2022
Thursday, 23 June 2022 | 09:57
SHENZHEN, CHINA -
Media OutReach
- 23 June 2022 - The annual Computer Vision and Pattern Recognition
Conference (CVPR) came to an end in New Orleans today, with globally
leading technology company OPPO successfully having seven of its
submitted papers selected for the conference, putting it among the most
successful technology companies at the event. OPPO also placed in eight
of the widely watched competition events at the conference, taking home
three first place, one second place, and four third place prizes.
As deep learning technology has developed over the years, artificial
intelligence has shifted from perceptual intelligence to cognitive
intelligence. In addition to being able to 'see' or 'hear' like humans,
modern AI technology is now able to demonstrate a similar level of
cognitive ability to humans too. Multimodal fusion, 3D visual
intelligence technology, and automated machine learning are becoming key
research topics in the field of AI, and areas in which OPPO has
achieved several theoretical and technological breakthroughs of its own.
"In 2012, deep neural networks designed for image recognition tasks
re-energized the research and application of artificial intelligence.
Since then, AI technology has seen a decade of rapid development," said
Guo Yandong, Chief Scientist in Intelligent Perception at OPPO. "OPPO
continues to push artificial intelligence toward accomplishing complex
perceptual and cognitive behaviors. For example, AI can now learn from
massive unlabeled data and transfer that learning to downstream tasks, or
reconstruct 3D information from a limited number of viewpoints. We also empower AI
with higher cognitive abilities to understand and create beauty and
develop embodied AI with autonomous behavior. I'm delighted to see that
seven of our papers have been selected for this year's conference.
Building on this success, we will continue to explore both fundamental
AI and cutting-edge AI technology, as well as the commercial
applications that will enable us to bring the benefits of AI to more
people."
The seven papers accepted by CVPR 2022 showcase OPPO's progress in humanizing AI
Seven papers submitted by OPPO for this year's CVPR were selected for
presentation at the conference. Their areas of research include
multimodal information interaction, 3D human body reconstruction,
personalized image aesthetics assessment, knowledge distillation, and
others.
Cross-modal technology is seen as the key to 'humanizing' artificial
intelligence. Different modal data have different characteristics. Text
information often features a high level of generality, whereas visual
image information contains a large amount of specific contextual
details. Establishing effective interaction between multimodal data
remains a major challenge. OPPO researchers proposed CRIS, a new
framework based on the CLIP model that enables AI to develop a more
fine-grained understanding of text and image data. Given a complex text
description, the model can accurately locate the matching visual
information within an image.
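As a rough illustration of the idea behind text-to-pixel matching (a toy sketch, not OPPO's actual CRIS implementation), the snippet below scores every pixel-level feature against a text embedding using cosine similarity; the function name, shapes, and random features are all hypothetical:

```python
import numpy as np

def match_text_to_pixels(pixel_feats, text_feat):
    """Score each pixel's feature against a text embedding (cosine similarity)."""
    # pixel_feats: (H, W, D) per-pixel visual features; text_feat: (D,) text embedding
    p = pixel_feats / np.linalg.norm(pixel_feats, axis=-1, keepdims=True)
    t = text_feat / np.linalg.norm(text_feat)
    return p @ t  # (H, W) map: higher values = better match to the description

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 4, 8))   # stand-in for encoder output
text = rng.standard_normal(8)            # stand-in for text encoder output
heatmap = match_text_to_pixels(feats, text)
mask = heatmap > 0.5                     # threshold into a rough segmentation mask
```

In a real referring-image-segmentation model, the per-pixel features and the text embedding would come from trained vision and language encoders rather than random data.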
The biggest difference between human and artificial intelligence today
lies in multimodality. Human beings are able to easily understand
information in both words and images and draw associations between the
two types of information. AI, on the other hand, has yet to move past
simple recognition and still struggles to accurately match information
across different modalities. The novel method proposed by
OPPO improves multimodal intelligence, which could potentially lead to
artificial intelligence being able to truly understand and interpret the
world through multiple forms of information such as language, hearing,
vision, and others, making the robot and digital assistants of sci-fi
movies become a reality.
3D human body reconstruction is another area in which the OPPO Research
Institute has made significant progress. At CVPR, OPPO demonstrated a
process for automatically generating digital avatars of humans with
clothing that behaves more naturally. The solution was achieved by
improving on NeRF-based dynamic character modeling. By analyzing RGB
video of humans captured with a camera, the OPPO model can accurately
generate 3D, 1:1 dynamic models that include small details like logos or
fabric textures. Creating accurate 3D models of clothes has remained
one of the biggest challenges in the field of AI due to the difficulty
in observing the deformation of clothes depending on the posture of
those wearing them. This makes it difficult for AI to recognize and
distinguish the deformation in certain parts of clothing, for example, a
hemline. The new model effectively reduces the requirements needed to
perform 3D human body reconstruction, providing technical foundations
that can be applied to areas such as virtual dressing rooms for online
shopping, AI fitness instruction, and the creation of lifelike avatars
in VR/AR worlds.
AI image recognition has now reached a stage where it can accurately
identify a wide range of objects within an image. However, the next
challenge in this area is developing AI that is capable of interpreting
an image for its aesthetic value. The ability of AI to evaluate images
in terms of their perceived aesthetic quality is often strongly related
to the big data used in training the AI model. As a result, 'opinions'
provided by AI are often not to everyone's taste, and in many cases,
models have been shown to demonstrate clear biases. This has led to the
development of more refined data and models that take into account the
diverse preferences of different people.
In collaboration with Leida Li, a professor from Xidian University, the
OPPO Research Institute developed a solution to this problem: the
innovative Personalized Image Aesthetics Assessment (PIAA) model.
The model is the first to optimize AI aesthetics assessment by combining
users' subjective preferences with more generalized aesthetic values.
The algorithm can perform personalized image evaluations based on
preferences learned from user profiles. In the future, the model will be
used to create personalized experiences, from curating photo albums to
recommending how best to shoot a photo and suggesting content a user
might prefer.
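The general recipe behind personalization of this kind, blending a population-level aesthetic score with per-user preferences, can be sketched in a few lines. This is a hypothetical illustration, not the published PIAA model; the attribute names and weights are invented:

```python
def personalized_score(generic_score, user_prefs, image_attrs):
    """Adjust a generic aesthetic score with a user's learned preferences.

    generic_score: population-level aesthetic score in [0, 10]
    user_prefs: per-user weights for image attributes, e.g. {"portrait": 1.5}
    image_attrs: attributes detected in the image
    """
    offset = sum(user_prefs.get(attr, 0.0) for attr in image_attrs)
    # Clamp the adjusted score back into the valid range
    return max(0.0, min(10.0, generic_score + offset))

prefs = {"portrait": 1.5, "landscape": -0.5}
score = personalized_score(6.0, prefs, ["portrait", "bokeh"])  # -> 7.5
```

A real PIAA system would learn the preference weights from a user's rating history rather than taking them as a hand-written dictionary.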
OPPO has also chosen to open-source the PIAA evaluation dataset for
developers, with a number of research institutions and
universities already expressing an interest in using the data to further
their own efforts in personalized AI aesthetic assessment.
Further to this, OPPO also proposed a multi-view 3D semantic plane
reconstruction solution capable of accurately analyzing surfaces within a
3D environment. The technology can recognize semantic characteristics
of different surfaces, such as the ground, desktops, and walls, to a
much higher degree of precision than current mainstream single-view
reconstruction architecture. Developed in partnership with Tsinghua
University, the INS-Conv (INcremental Sparse Convolution) method can achieve
faster and more accurate online 3D semantic and instance segmentation.
This can effectively reduce the computing power needed to perform
environment recognition, which will enable such technology to be more
easily adopted in applications such as automated driving and VR.
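The core idea of incremental sparse processing, recomputing features only where the 3D scene has changed rather than over the entire grid, can be sketched as follows. This is a toy illustration under assumed data structures, not the INS-Conv implementation:

```python
def incremental_update(feature_grid, changed_voxels, conv):
    """Update only the voxels touched by new sensor data, instead of
    re-running the convolution over the whole sparse 3D grid."""
    for v in changed_voxels:
        feature_grid[v] = conv(feature_grid.get(v, 0.0))
    return feature_grid

# Toy sparse grid: voxel coordinate -> feature value (only occupied voxels stored)
grid = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (5, 5, 5): 3.0}
# Only one voxel changed, so only one feature is recomputed
updated = incremental_update(grid, [(1, 0, 0)], conv=lambda x: x + 0.5)
```

The saving comes from touching a handful of voxels per frame instead of the full scene, which is what makes online semantic and instance segmentation cheap enough for real-time use.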
OPPO makes AI 'lightweight' with second-place win in the NAS Challenge
Alongside the presentation and review of the latest research in computer
vision and pattern recognition technology, CVPR 2022 also saw a number
of technical challenges take place, with OPPO placing third or higher in
eight challenges. These include the neural architecture search (NAS)
challenge, SoccerNet, SoccerNet Replay Grounding, ActivityNet temporal
localization, the 4th Large-scale Video Object Segmentation
Challenge, the ACDC Challenge 2022 on semantic segmentation in adverse
visual conditions, and WAD Argoverse2 Motion Forecasting.
From mobile photography to automated driving, deep learning models are
being applied across an increasingly wide range of industries. However,
deep learning relies heavily on big data and computing power, and its
high cost presents challenges to commercial
implementation. Neural architecture search (NAS) techniques can
automatically discover and implement optimal neural network
architectures, reducing the dependence on human experience and other
inputs to enable truly automated machine learning. In the NAS
competition, OPPO researchers trained a supernetwork from which 45,000
sub-networks inherited their parameters. By addressing the model
parameter forgetting and unfair gradient descent problems, the team
achieved a high level of consistency between the sub-networks' actual
performance and their predicted ranking, taking second place among all
entrants.
Using the NAS technique, researchers only need to train a large
supernetwork and create a predictor to let the subnetworks learn by
inheriting the supernetwork parameters. This provides an efficient and
low-cost approach to obtaining a deep learning model that outperforms
those manually designed by expert network architects. This technology
can be applied to most current artificial intelligence algorithms and
can help AI technology that typically requires a large amount of
computing power run on mobile devices, by guiding the search toward
architectures that perform well under specific resource constraints.
This will ultimately bring previously unthinkable levels of AI
technology to mobile devices in the near future.
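A minimal sketch of weight-sharing NAS, assuming a toy search space and stand-in shared weights (not OPPO's competition system): candidate sub-networks are sampled and scored using parameters inherited from the supernetwork, so no per-candidate training is needed.

```python
import random

def sample_subnet(search_space, rng):
    """Pick one operation per layer to define a sub-network architecture."""
    return [rng.choice(ops) for ops in search_space]

def evaluate_subnet(subnet, shared_weights):
    """Proxy score: sub-networks reuse (inherit) the supernetwork's weights,
    so each candidate is scored without any training of its own."""
    return sum(shared_weights[(i, op)] for i, op in enumerate(subnet))

rng = random.Random(0)
search_space = [["conv3x3", "conv5x5", "skip"]] * 4  # 3^4 = 81 candidate nets
# Stand-in for a trained supernetwork's shared weights (assumed, for illustration)
weights = {(i, op): rng.random() for i in range(4) for op in search_space[i]}

candidates = [sample_subnet(search_space, rng) for _ in range(20)]
best = max(candidates, key=lambda s: evaluate_subnet(s, weights))
```

In a real system the proxy score would be validation accuracy measured with the inherited weights, and the challenge OPPO addressed is precisely making that proxy ranking agree with the ranking of fully trained networks.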
In addition to its success in the NAS challenge, OPPO also took first
place in the SoccerNet Replay Grounding challenge and third place in the
SoccerNet Action Spotting challenge, following second place wins in
both categories at CVPR last year.
During CVPR 2022, OPPO also participated in seminar presentations and
three high-level workshops. At the SLAM seminar, OPPO researcher Deng
Fan shared how real-time vSLAM could be run on smartphones and AR/VR
devices. OPPO researcher Li Yikang also delivered a speech at the mobile
artificial intelligence seminar and presented OPPO's method for
performing unsupervised cross-modal hashing between video and text.
Named CLIP4Hashing, this method presents an important approach to
performing cross-modal search on mobile devices. At the AICITY Workshop,
Li Wei proposed a multi-view-based motion localization system to
identify abnormal driver behavior while driving.
OPPO is bringing the benefits of AI to more people, sooner
This is the third year that OPPO has participated in CVPR. Over the past
three years, AI research has undergone a dramatic shift from the
development of specific applications like facial recognition, to more
fundamental technologies that have wider reaching implications.
OPPO's rising success at CVPR during these three years owes much to its
continued investment in AI technology. OPPO first began investing in AI
development in 2015, establishing R&D teams dedicated to language
& semantics, computer vision, and other disciplines. At the
beginning of 2020, the Institute of Intelligent Perception and
Interaction was established under the OPPO Research Institute to further
deepen OPPO's exploration of cutting-edge AI technologies. Today, OPPO
has more than 2,650 global patent applications in the field of AI,
covering computer vision, speech technology, natural language
processing, machine learning and more.
Guided by its brand proposition, 'Inspiration Ahead', OPPO is also
working with partners across the industry to take AI technology from the
laboratory into daily life. In December 2021, OPPO launched its first
self-developed dedicated imaging NPU, MariSilicon X. The NPU boasts both
powerful computing performance and high energy efficiency to enable
complex AI algorithms to be run at unprecedented speeds on mobile
devices, delivering superior video quality through advanced night video
and other image processing algorithms. OPPO's AI technology has also
been used to develop products and features such as the real-time spatial
AR generator CybeReal, OPPO Air Glass, Omoji, and more. Through these
technologies, OPPO is aiming to create more lifelike digital worlds that
blend the virtual and the real, creating all-new experiences for users.