Pose detection papers

2 answers

吃不饱的阿呜

Accepted answer

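In human pose estimation the first step is keypoint detection, which requires a high-resolution heatmap. Conventional feature-extraction backbones such as VGG reduce the feature-map resolution sharply and lose spatial structure. VGG is a serial architecture; HRNet replaces that serial structure with a parallel one, keeping feature maps of different resolutions side by side in parallel branches. Here is a quick look at the HRNet family.

Application: human pose estimation (HRNetV1)
Method: use only the high-resolution feature map.

Application: facial landmark detection (HRNetV2)
Method: use the feature maps at all resolutions. The low-resolution maps are upsampled and concatenated with the high-resolution map, followed by a 1x1 convolution and a softmax layer that produces the segmentation prediction map.

Application: image classification
Method (HRNet-Wx-C): the four feature maps of different resolutions pass through bottleneck layers that increase their channel counts. Starting from the highest resolution, each map goes through a strided convolution and is added element-wise to the next lower-resolution map. A 1x1 convolution then doubles the channels (1024 -> 2048), and the result is global-average-pooled and fed to the classifier.

Application: object detection
Method (HRNetV2p): the concatenated HRNetV2 feature map is average-pooled at several scales to produce representations at different levels, and each level passes through a 1x1 convolution to form a feature pyramid.

References:
[1] An introduction to HRNet
[2] [Paper reading] HRNetV1, HRNetV2, HRNetV2p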
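To make the "upsample the low-resolution maps, concatenate, then apply a 1x1 convolution" step more concrete, here is a minimal sketch in plain Python. It is not the HRNet authors' code: feature maps are nested lists shaped [channels][height][width], the function names are made up for illustration, and nearest-neighbour upsampling is used as a simplifying assumption (real implementations use a tensor library and typically bilinear interpolation). The softmax step is left out.

```python
# Illustrative sketch only. Feature maps are nested lists [C][H][W].

def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a [C][H][W] feature map."""
    return [
        [[row[x // factor] for x in range(len(row) * factor)]
         for row in channel for _ in range(factor)]
        for channel in fmap
    ]


def fuse_multi_resolution(branches):
    """Upsample every branch to the first branch's size and concatenate.

    branches: list of [C][H][W] maps, highest resolution first. Each lower
    resolution is assumed to divide the highest one exactly.
    """
    target_h = len(branches[0][0])
    fused = []
    for branch in branches:
        factor = target_h // len(branch[0])
        fused.extend(upsample_nearest(branch, factor))  # channel concat
    return fused


def conv1x1(fmap, weights):
    """1x1 convolution without bias; weights is [out_channels][in_channels]."""
    height, width = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(w * fmap[c][y][x] for c, w in enumerate(w_out))
          for x in range(width)] for y in range(height)]
        for w_out in weights
    ]


# One 4x4 high-resolution channel and two 2x2 low-resolution channels.
high = [[[1.0] * 4 for _ in range(4)]]
low = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]

fused = fuse_multi_resolution([high, low])      # 3 channels, each 4x4
head = conv1x1(fused, [[1.0, 1.0, 0.0]])        # 1 output channel, 4x4
print(len(fused), len(fused[0]), len(fused[0][0]))
```

The point of the sketch is only the data flow: every branch ends up at the highest resolution before the channels are mixed, which is why the fused map keeps the spatial detail that a serial backbone would have lost.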


blackiron.sh

Reference:
Pose paper collection:
Classic projects:

Pose estimation and action recognition are fundamentally different tasks. Action recognition can be viewed as person localization plus action classification, whereas pose estimation can be understood as detecting keypoints and assigning each keypoint an ID (in both the multi-person and the single-person setting).

Because of the limits of data-collection equipment, most pose datasets today are built from frames cut out of public videos. 2D datasets are therefore relatively easy to obtain, while 3D datasets are harder to obtain. 2D datasets cover both indoor and outdoor scenes, whereas 3D datasets currently cover only indoor scenes.

Link:
Samples: >= 300K
Joints: 18
Full body, multi-person; keypoints on 100K people

Link:
Samples: 2K
Joints: 14
Full body, single person
"... LSP dataset to 10,000 images of people performing gymnastics, athletics and parkour."

Link:
Samples: 20K
Joints: 9
Full body, single person

Samples: 25K
Full body, single-person/multi-person; 40K people; 410 human activities
16 keypoints: 0 - r ankle, 1 - r knee, 2 - r hip, 3 - l hip, 4 - l knee, 5 - l ankle, 6 - pelvis, 7 - thorax, 8 - upper neck, 9 - head top, 10 - r wrist, 11 - r elbow, 12 - r shoulder, 13 - l shoulder, 14 - l elbow, 15 - l wrist
No mask annotations

In order to analyze the challenges for fine-grained human activity recognition, we build on our recent publicly available "MPI Human Pose" dataset [2]. The dataset was collected from YouTube videos using an established two-level hierarchy of over 800 everyday human activities. The activities at the first level of the hierarchy correspond to thematic categories, such as "Home repair", "Occupation", "Music playing", etc., while the activities at the second level correspond to individual activities, e.g. "Painting inside the house", "Hairstylist" and "Playing woodwind". In total the dataset contains 20 categories and 410 individual activities covering a wider variety of activities than other datasets, while its systematic data collection aims for a fair activity coverage. Overall the dataset contains 24,920 video snippets and each snippet is at least 41 frames long. Altogether the dataset contains over 1M frames. Each video snippet has a key frame containing at least one person with a sufficient portion of the body visible and annotated body joints. There are 40,522 annotated people in total. In addition, for a subset of key frames richer labels are available, including full 3D torso and head orientation and occlusion labels for joints and body parts.

14 keypoints: 0 - r ankle, 1 - r knee, 2 - r hip, 3 - l hip, 4 - l knee, 5 - l ankle, 8 - upper neck, 9 - head top, 10 - r wrist, 11 - r elbow, 12 - r shoulder, 13 - l shoulder, 14 - l elbow, 15 - l wrist
No mask annotations; head bounding-box annotations are included

PoseTrack is a large-scale benchmark for human pose estimation and tracking in image sequences. It provides a publicly available training and validation set as well as an evaluation server for benchmarking on a held-out test set.

In the PoseTrack benchmark each person is labeled with a head bounding box and positions of the body joints. We omit annotations of people in dense crowds and in some cases also choose to skip annotating people in upright standing poses. This is done to focus annotation efforts on the relevant people in the scene. We include ignore regions to specify which people in the image were ignored during annotation.

Each sequence included in the PoseTrack benchmark corresponds to about 5 seconds of video. The number of frames in each sequence might vary as different videos were recorded with different numbers of frames per second. For the **training** sequences we provide annotations for 30 consecutive frames centered in the middle of the sequence. For the **validation and test** sequences we annotate 30 consecutive frames and in addition annotate every 4-th frame of the sequence.
The rationale for that is to evaluate both the smoothness of the estimated body trajectories and the ability to generate consistent tracks over a longer temporal span. Note that even though we do not label every frame in the provided sequences, we still expect the unlabeled frames to be useful for achieving better performance on the labeled frames.

The PoseTrack 2018 submission file format is based on the Microsoft COCO dataset annotation format. We decided for this step to 1) maintain compatibility with a commonly used format and commonly used tools while 2) allowing for sufficient flexibility for the different challenges. These are the 2D tracking challenge, the 3D tracking challenge as well as the dense 2D tracking challenge.

Furthermore, we require submissions in a zipped version of either one big .json file or one .json file per sequence to 1) be flexible w.r.t. tools for each sequence (e.g., easy visualization for a single sequence independent of others) and 2) to avoid problems with file size and processing.

The MS COCO file format is a nested structure of dictionaries and lists. For evaluation, we only need a subset of the standard fields; however, a few additional fields are required for the evaluation protocol (e.g., a confidence value for every estimated body landmark). In the following we describe the minimal, but required set of fields for a submission. Additional fields may be present, but are ignored by the evaluation script.
At top level, each .json file stores a dictionary with three elements:
* images
* annotations
* categories

`images`: a list of the images described in this file. The list must contain the information for all images referenced by a person description in the file. Each list element is a dictionary and must contain only two fields: `file_name` and `id` (unique int). The file name must refer to the original posetrack image as extracted from the test set, e.g., `images/test/023736_mpii_test/`.

`annotations`: another list of dictionaries. Each item of the list describes one detected person and is itself a dictionary. It must have at least the following fields:
* `image_id` (int, an image with a corresponding id must be in `images`),
* `track_id` (int, the track this person is performing; unique per frame),
* `keypoints` (list of floats, length three times number of estimated keypoints in order x, y, ? for every point. The third value per keypoint is only there for COCO format consistency and not used.),
* `scores` (list of float, length number of estimated keypoints; each value between 0. and 1. providing a prediction confidence for each keypoint).

This dataset has 3.6 million 3D human poses with corresponding images and 11 subjects (6 male, 5 female; papers typically use subjects 1, 5, 6, 7, 8 for training and 9, 11 for testing). It covers 17 action scenarios such as discussion, eating, exercising and greeting. The data was captured with 4 digital video cameras, 1 time-of-flight sensor and 10 motion cameras.

Produced by the Max Planck Institute for Informatics; for details see the paper "Monocular 3D Human Pose Estimation In The Wild Using Improved CNN Supervision".
Paper link:

1. Key papers on single-person pose estimation
2014----Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations
2014----DeepPose: Human Pose Estimation via Deep Neural Networks
2014----Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation
2014----Learning Human Pose Estimation Features with Convolutional Networks
2014----MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation
2015----Efficient Object Localization Using Convolutional Networks
2015----Human Pose Estimation with Iterative Error
2015----Pose-based CNN Features for Action Recognition
2016----Advancing Hand Gesture Recognition with High Resolution Electrical Impedance Tomography
2016----Chained Predictions Using Convolutional Neural Networks
2016----CPM----Convolutional Pose Machines
2016----CVPR-2016----End-to-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation
2016----Deep Learning of Local RGB-D Patches for 3D Object Detection and 6D Pose Estimation
2016----PAFs----Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields (openpose)
2016----Stacked hourglass----Stacked Hourglass Networks for Human Pose Estimation
2016----Structured Feature Learning for Pose Estimation
2017----Adversarial PoseNet: A Structure-aware Convolutional Network for Human pose estimation (alphapose)
2017----CVPR2017 oral----Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
2017----Learning Feature Pyramids for Human Pose Estimation
2017----Multi-Context Attention for Human Pose Estimation
2017----Self Adversarial Training for Human Pose Estimation

2. Key papers on multi-person pose estimation
2016----Associative Embedding: End-to-End Learning for Joint Detection and Grouping
2016----DeepCut----Joint Subset Partition and Labeling for Multi Person Pose Estimation
2016----DeepCut----Joint Subset Partition and Labeling for Multi Person Pose Estimation (poster)
2016----DeeperCut----DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model
2017----G-RMI----Towards Accurate Multi-person Pose Estimation in the Wild
2017----RMPE: Regional Multi-Person Pose Estimation
2018----Cascaded Pyramid Network for Multi-Person Pose Estimation
2018----DensePose: Dense Human Pose Estimation in the Wild (read closely; DensePose warrants further study)
2018----3D Human Pose Estimation in the Wild by Adversarial Learning
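Going back to the PoseTrack 2018 submission format described earlier in this answer, here is a rough sketch of how such a file could be assembled and sanity-checked in Python. It is not the official PoseTrack tooling: the helper names `make_submission` and `check_submission`, the input tuple layout and the example file name are all hypothetical. The contents of `categories` are not described in the text above, so the sketch leaves it as an empty placeholder, and it fills the unused third keypoint value with 1.0 as an arbitrary choice.

```python
import json


def make_submission(frames):
    """Build the top-level dictionary from per-frame predictions.

    frames: list of (file_name, people), where people is a list of
    (track_id, [(x, y, score), ...]). This input layout is hypothetical.
    """
    images, annotations = [], []
    for image_id, (file_name, people) in enumerate(frames):
        images.append({"file_name": file_name, "id": image_id})
        for track_id, points in people:
            keypoints, scores = [], []
            for x, y, score in points:
                # Third value is only there for COCO format consistency.
                keypoints.extend([float(x), float(y), 1.0])
                scores.append(float(score))
            annotations.append({
                "image_id": image_id,
                "track_id": track_id,
                "keypoints": keypoints,
                "scores": scores,
            })
    # "categories" is a placeholder; its required content is not given above.
    return {"images": images, "annotations": annotations, "categories": []}


def check_submission(sub):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if key not in sub:
            problems.append(f"missing top-level key: {key}")
    image_ids = {img["id"] for img in sub.get("images", [])}
    seen = set()
    for i, ann in enumerate(sub.get("annotations", [])):
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {i}: unknown image_id")
        pair = (ann["image_id"], ann["track_id"])
        if pair in seen:
            problems.append(f"annotation {i}: track_id repeated in one frame")
        seen.add(pair)
        if len(ann["keypoints"]) != 3 * len(ann["scores"]):
            problems.append(f"annotation {i}: keypoints/scores length mismatch")
        if not all(0.0 <= s <= 1.0 for s in ann["scores"]):
            problems.append(f"annotation {i}: score outside [0, 1]")
    return problems


sub = make_submission([
    ("images/test/example_sequence/frame_0.jpg",   # hypothetical file name
     [(0, [(10.0, 20.0, 0.9), (12.0, 24.0, 0.7)])]),
])
print(json.dumps(sub, indent=2))
print(check_submission(sub))
```

The checks only mirror the constraints stated in the format description (known `image_id`, `track_id` unique per frame, three keypoint values per score, scores between 0 and 1); the real evaluation script may enforce more.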
