Data transfer and dataset request
This page explains how to transfer individual data files and how to request the transfer of a CMS dataset.
The gsiftp and xroot protocols are explained in the description of the CMS Tier-3 Computing Environment. You can transfer files to KISTI Tier-3 at any time.
1. Data Transfer
When you know the exact location of the data file
If you know the exact location of the data file, the transfer is quick and easy. You can transfer data with the following commands:
xrdcp root://[source of data file]//[Path of source data file] [Dest]
xrdcp root://eoscms.cern.ch//store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/CE860B10-5D76-E711-BCA8-FA163EAA761A.root .
## xrdcp supports the -r option (recursive copy, including subdirectories).
xrdcp -r root://eoscms.cern.ch//store/mc/SAM/GenericTTbar/AODSIM/CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000 .
Both the source and the destination must be either a local path or an XRootD server.
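The command pattern above can be sketched as a small Python helper that assembles the xrdcp argument list. This is only an illustration: the function names xrootd_url and xrdcp_command are hypothetical and are not part of any KISTI or CMS tool.

```python
import shlex


def xrootd_url(server, path):
    """Join an XRootD server name and an absolute path into a root:// URL."""
    # The double slash after the server name marks the start of the absolute path.
    return f"root://{server}//{path.lstrip('/')}"


def xrdcp_command(server, path, dest=".", recursive=False):
    """Return the xrdcp argument list for one file or, with -r, one directory."""
    cmd = ["xrdcp"]
    if recursive:
        cmd.append("-r")  # recursive copy, as mentioned in the comment above
    cmd += [xrootd_url(server, path), dest]
    return cmd


if __name__ == "__main__":
    # File path taken from the example above.
    path = ("/store/mc/SAM/GenericTTbar/AODSIM/"
            "CMSSW_9_2_6_91X_mcRun1_realistic_v2-v1/00000/"
            "CE860B10-5D76-E711-BCA8-FA163EAA761A.root")
    print(shlex.join(xrdcp_command("eoscms.cern.ch", path)))
```

The list returned by xrdcp_command can be passed directly to subprocess.run on a machine where xrdcp and a valid grid proxy are available.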
When you do not know the location of the data file
If you don't know the location of the data file, you can either look up where it is stored and transfer it from there, or use the CMS Any Data, Anytime, Anywhere (AAA) feature to transfer it regardless of where it is stored.
Find Data Files
The most convenient way to locate a data file is to use dasgoclient or the DAS home page.
dasgoclient --query="site file=/store/mc/RunIIFall17NanoAODv5/QCD_HT1000to1500_TuneCP5_13TeV-madgraph-pythia8/NANOAODSIM/PU2017_12Apr2018_Nano1June2019_102X_mc2017_realistic_v7-v1/60000/04323B9F-4B66-2D44-89DE-644E8C48246D.root"
T1_RU_JINR_Disk T1_US_FNAL_Buffer T1_US_FNAL_MSS T2_KR_KISTI T2_US_Purdue T2_US_Wisconsin T3_KR_KISTI
Now that we know which sites store the file, check the XRootD prefix of one of those sites.
cat /cvmfs/cms.cern.ch/SITECONF/T2_US_Purdue/PhEDEx/storage.xml | grep "root://"
<lfn-to-pfn protocol="xrootd" destination-match=".*" path-match="/+store/(.*)" result="root://cmsxrootd.fnal.gov//store/$1"/>
<lfn-to-pfn protocol="root" destination-match=".*" path-match="/+store/(.*)" result="root://xrootd.rcac.purdue.edu//store/$1"/>
<pfn-to-lfn protocol="root" destination-match=".*" path-match="root://xrootd.rcac.purdue.edu//store/(.*)" result="/$1"/>
In the output above, the site's xrootd protocol endpoint is root://xrootd.rcac.purdue.edu.
Let's transfer the file using the prefix.
xrdcp root://xrootd.rcac.purdue.edu//store/mc/RunIIFall17NanoAODv5/QCD_HT1000to1500_TuneCP5_13TeV-madgraph-pythia8/NANOAODSIM/PU2017_12Apr2018_Nano1June2019_102X_mc2017_realistic_v7-v1/60000/04323B9F-4B66-2D44-89DE-644E8C48246D.root .
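The lookup above amounts to applying an lfn-to-pfn rule: the path-match regular expression captures the part of the path after /store/, and $1 in result is replaced by that capture. The following is a minimal sketch of that substitution, not the full CMS rule resolution. It handles only a single $1 group, the function name lfn_to_pfn is hypothetical, and the storage-mapping wrapper element is added here only so that the two grep'd rules parse as one XML document.

```python
import re
import xml.etree.ElementTree as ET

# The two lfn-to-pfn rules from the grep output above, wrapped in a root
# element (an assumption made so the fragment is well-formed XML).
RULES_XML = """<storage-mapping>
<lfn-to-pfn protocol="xrootd" destination-match=".*" path-match="/+store/(.*)" result="root://cmsxrootd.fnal.gov//store/$1"/>
<lfn-to-pfn protocol="root" destination-match=".*" path-match="/+store/(.*)" result="root://xrootd.rcac.purdue.edu//store/$1"/>
</storage-mapping>"""


def lfn_to_pfn(lfn, protocol, rules_xml=RULES_XML):
    """Return the physical URL for lfn using the first matching rule, else None."""
    root = ET.fromstring(rules_xml)
    for rule in root.iter("lfn-to-pfn"):
        if rule.get("protocol") != protocol:
            continue
        match = re.fullmatch(rule.get("path-match"), lfn)
        if match:
            # Only the first capture group ($1) is handled in this sketch.
            return rule.get("result").replace("$1", match.group(1))
    return None


if __name__ == "__main__":
    print(lfn_to_pfn("/store/mc/example/file.root", "root"))
```

With protocol "root" the sketch yields the site endpoint used in the xrdcp command above, and with protocol "xrootd" it yields the cmsxrootd.fnal.gov prefix.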
Using Data Auto-Discovery (CMS AAA)
The CMS AAA feature lets you transfer files without looking up the site that stores them. Use the fixed prefix (root://cmsxrootd.fnal.gov) in place of a site-specific one.
xrdcp root://cmsxrootd.fnal.gov//store/mc/RunIIFall17NanoAODv5/QCD_HT1000to1500_TuneCP5_13TeV-madgraph-pythia8/NANOAODSIM/PU2017_12Apr2018_Nano1June2019_102X_mc2017_realistic_v7-v1/60000/04323B9F-4B66-2D44-89DE-644E8C48246D.root .
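The two approaches in this section can be combined: try a site-specific prefix first and fall back to the fixed AAA prefix. The sketch below is only an illustration under that assumption. The name copy_with_fallback and its run parameter are hypothetical, and run defaults to subprocess.call so that the exit status of xrdcp decides whether a copy succeeded.

```python
import subprocess

# Fixed AAA prefix quoted in the text above.
AAA_PREFIX = "root://cmsxrootd.fnal.gov"


def copy_with_fallback(lfn, site_prefixes=(), dest=".", run=subprocess.call):
    """Try each site prefix in order, then the AAA prefix.

    Returns the URL that was copied successfully, or None if every attempt failed.
    """
    for prefix in list(site_prefixes) + [AAA_PREFIX]:
        url = f"{prefix}//{lfn.lstrip('/')}"
        # An exit status of 0 from xrdcp is treated as success.
        if run(["xrdcp", url, dest]) == 0:
            return url
    return None
```

For example, copy_with_fallback(lfn, ["root://xrootd.rcac.purdue.edu"]) would first attempt the Purdue endpoint found above and only then the AAA prefix.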
2. Requesting a dataset transfer
(1) Rucio (For Public Dataset)
You must be assigned a space quota for T3_KR_KISTI before you can transfer data with the rucio command. If you need a quota, please request one by e-mail (cmst3-support@kisti.re.kr) with the following information.
CERN Computing ID
Requested quota capacity (up to 10TB applicable without additional proof)
Site for which the quota is requested (T2_KR_KISTI or T3_KR_KISTI)
Enable the rucio command with the following commands. (We also provide the rucioenv command as a shortcut.)
### This may not work if you have already run cmsenv inside a CMSSW directory.
source /cvmfs/cms.cern.ch/cmsset_default.sh
source /cvmfs/cms.cern.ch/rucio/setup.sh
export RUCIO_ACCOUNT=<CERN ID>
voms-proxy-init --voms cms
### or use the command below
rucioenv
Request the dataset transfer with one of the commands below.
### If you have enough quota,
rucio add-rule cms:<DATASET> 1 T3_KR_KISTI
## If you do not have enough quota,
rucio add-rule --ask-approval cms:<DATASET> 1 T3_KR_KISTI
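The choice between the two commands above can be sketched as a helper that adds --ask-approval only when the dataset does not fit into the remaining quota. The function name add_rule_command and the two size parameters are hypothetical; how you obtain the dataset size and your free quota is left open here.

```python
def add_rule_command(dataset, dataset_tb, free_quota_tb, rse="T3_KR_KISTI", copies=1):
    """Build the 'rucio add-rule' argument list shown above.

    dataset_tb and free_quota_tb are illustrative inputs: the caller is
    assumed to know the dataset size and the remaining quota in TB.
    """
    cmd = ["rucio", "add-rule"]
    if dataset_tb > free_quota_tb:
        # Not enough quota: ask for approval instead of failing outright.
        cmd.append("--ask-approval")
    cmd += [f"cms:{dataset}", str(copies), rse]
    return cmd
```

The returned list mirrors the two command lines above and can be handed to subprocess.run after the rucio environment has been set up.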
You can check the status of your transfer request with the command below.
rucio list-rules --account geonmo
Output Format
[rucio rule ID as hex code] [User] cms:[Dataset] [State, followed by the OK/REPL/STUCK file counts] [Tier Center] [Copies]
OK : Completed files
REPL : Files being transferred
STUCK : Files being retried because the transfer failed
20e8a18cb7e04a75b8d8106baf4ece01 geonmo cms:/gluinoGMSB_M3000_ctau10000p0_TuneCP2_13TeV_pythia8/RunIISummer16NanoAODv7-PUMoriond17_Nano02Apr2020_102X_mcRun2_asymptotic_v8-v1/NANOAODSIM OK[9/0/0] T3_KR_KISTI 1
9 of 9 files have been transferred [OK: 9, REPL: 0, STUCK: 0]
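The state column described above, for example OK[9/0/0], can be split into its three file counts with a short parser. This is a sketch based only on the format shown in the example line; the function name parse_state is hypothetical.

```python
import re

# State name followed by [OK/REPL/STUCK] file counts, e.g. "OK[9/0/0]".
STATE_RE = re.compile(r"^([A-Z_]+)\[(\d+)/(\d+)/(\d+)\]$")


def parse_state(field):
    """Split a state field into the state name and per-state file counts."""
    match = STATE_RE.match(field)
    if match is None:
        raise ValueError(f"unexpected state field: {field!r}")
    ok, repl, stuck = (int(n) for n in match.groups()[1:])
    return {
        "state": match.group(1),
        "ok": ok,
        "repl": repl,
        "stuck": stuck,
        "total": ok + repl + stuck,
    }


if __name__ == "__main__":
    print(parse_state("OK[9/0/0]"))
```

A rule is fully transferred when the ok count equals total, which is the case for the example line above.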
(2) xrdDownload.py (for User Private Dataset)
Datasets created and published by an individual user cannot yet be transferred through rucio directly. We therefore provide scripts that make it easy to transfer the files one by one. Please follow the procedure below to transfer the data.
Download dataset2filelist.sh and xrdDownload.py from the hep-tools Git repository into a directory on your PATH.
Use dataset2filelist.sh to create the list of individual files in the datasets you want to download.
/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/swertz-TopNanoAODv6-1-1_2018-0d1d4920f08f56d048ece029b873a2cc/USER
/TTTo2L2Nu_TuneCP5_13TeV-powheg-pythia8/palencia-TopNanoAODv6-1-1_2018-0d1d4920f08f56d048ece029b873a2cc/USER
/TTToHadronic_TuneCP5_13TeV-powheg-pythia8/swertz-TopNanoAODv6-1-1_2018-0d1d4920f08f56d048ece029b873a2cc/USER
[geonmo@ui20 migration]$ dataset2filelist.sh input_dataset.txt datalist.txt
dataset2filelist.sh <from dataset list file> <to data list result file>
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_597.root 615580493
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_24.root 615219932
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_17.root 615423490
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_387.root 616238957
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_538.root 615412910
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_32.root 615054868
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_535.root 615491181
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_405.root 615219946
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_622.root 615852354
/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_147.root 615512903
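Each line of the resulting list holds a file path followed by a number. Assuming that the second column is the file size in bytes (the output above does not state this), the list can be read and totalled as in the sketch below. The function name read_datalist is hypothetical.

```python
def read_datalist(lines):
    """Parse 'path size' lines into (path, size) tuples, skipping blank lines."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so the path stays intact.
        path, size = line.rsplit(None, 1)
        entries.append((path, int(size)))  # size assumed to be in bytes
    return entries


if __name__ == "__main__":
    # In practice: with open("datalist.txt") as f: entries = read_datalist(f)
    sample = [
        "/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_597.root 615580493",
        "/store/user/swertz/topNanoAOD/v6-1-1/2018/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/TopNanoAODv6-1-1_2018/200610_101606/0000/tree_24.root 615219932",
    ]
    entries = read_datalist(sample)
    total = sum(size for _, size in entries)
    print(len(entries), "files,", total, "in total")
```

Summing the second column this way gives a rough idea of how much space the transfer will need before you start it.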
Run the xrdDownload.py script with this file list to transfer the files.
[geonmo@ui20 migration]$ xrdDownload.py -i datalist.txt -p 4
missing file
missing file
missing file
missing file
missing file
Source: root://cmsxrootd.fnal.gov//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0001/nano_1444.root
Destination: root://cms-xrdr.private.lo:2094//xrd//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0001/nano_1444.root
Source: root://cmsxrootd.fnal.gov//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0002/nano_2098.root
Destination: root://cms-xrdr.private.lo:2094//xrd//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0002/nano_2098.root
Source: root://cmsxrootd.fnal.gov//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_229.root
Source: root://cmsxrootd.fnal.gov//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_343.root
Destination: root://cms-xrdr.private.lo:2094//xrd//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_229.root
Destination: root://cms-xrdr.private.lo:2094//xrd//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_343.root
/store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_229.root is failed.
Total: 1565 / Success: 1 / Fail: 0
/store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_343.root is failed.
Source: root://cmsxrootd.fnal.gov//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0001/nano_1009.root
Destination: root://cms-xrdr.private.lo:2094//xrd//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0001/nano_1009.root
Total: 1565 / Success: 2 / Fail: 0
/store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0001/nano_1444.root is failed.
Source: root://cmsxrootd.fnal.gov//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_319.root
/store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0002/nano_2098.root is failed.
Destination: root://cms-xrdr.private.lo:2094//xrd//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_319.root
Total: 1565 / Success: 3 / Fail: 0
Source: root://cmsxrootd.fnal.gov//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_853.root
Destination: root://cms-xrdr.private.lo:2094//xrd//store/group/lpccoffea/coffeabeans/NanoAODv6/nano_2016/MET/NanoTuples-2016_Run2016B-17Jul2018_ver2-v1/191210_034740/0000/nano_853.root
Total: 1565 / Success: 4 / Fail: 0
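-i <Data list file> : Specifies the data file list. Files that have already been transferred are skipped automatically, so you can always pass the full list.
-p <num of parallel> : Specifies the number of files to transfer at the same time. 4 is a reasonable value; with 6 or more, errors become more likely toward the end of the transfer. Choose a value that suits your network conditions.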
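The behaviour of the two options, skipping files that are already present and copying a fixed number of files in parallel, can be sketched as follows. This is not the implementation of xrdDownload.py; it is an illustration under stated assumptions. The names transfer_all, already_done, and copy_one are hypothetical, and copy_one stands for any callable that copies one file and returns True on success.

```python
from concurrent.futures import ThreadPoolExecutor


def transfer_all(paths, already_done, copy_one, parallel=4):
    """Copy every path not in already_done, at most `parallel` at a time.

    Returns (succeeded, failed) lists in the order of the input list.
    """
    # Files transferred in an earlier run are skipped, so the full list can be reused.
    pending = [p for p in paths if p not in already_done]
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # map() keeps results in the same order as `pending`.
        results = list(pool.map(copy_one, pending))
    succeeded = [p for p, ok in zip(pending, results) if ok]
    failed = [p for p, ok in zip(pending, results) if not ok]
    return succeeded, failed
```

Because completed files are filtered out first, rerunning the function with the same list retries only the files that failed or were never attempted.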
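3. Deleting a requested dataset (recovering quota)
Datasets transferred with rucio can be deleted with the rucio delete-rule command. If another user has also requested the same dataset, the dataset itself is not deleted. Your quota is still restored, however, so you can free up the quota you need.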
[geonmo@ui20 ~]$ rucio list-rules --account geonmo
ID ACCOUNT SCOPE:NAME STATE[OK/REPL/STUCK] RSE_EXPRESSION COPIES EXPIRES (UTC) CREATED (UTC)
-------------------------------- --------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------- ---------------- -------- --------------- -------------------
2de8f9060ac94da38b898d9fb267d16f geonmo cms:/StealthSHH_2t4b_mStop-1200_mSo-100_TuneCP2_13TeV-madgraphMLM-pythia8/RunIIAutumn18NanoAODv7-Nano02Apr2020_102X_upgrade2018_realistic_v21-v1/NANOAODSIM OK[8/0/0] T3_KR_KISTI 1 2020-08-12 01:30:18
[geonmo@ui20 ~]$ rucio delete-rule 2de8f9060ac94da38b898d9fb267d16f
4. Checking rucio usage and quota
When using rucio, you may need to check your usage and your quota. You can check them as shown below.
(1) Checking the quota
[geonmo@ui20 ~]$ rucio list-account-limits geonmo
/cvmfs/cms.cern.ch/rucio/x86_64/slc7/py2/current/lib/python2.7/site-packages/paramiko/transport.py:33: CryptographyDeprecationWarning: Python 2 is no longer
supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in the next release.
from cryptography.hazmat.backends import default_backend
+-------------+-----------+
| RSE | LIMIT |
|-------------+-----------|
| T2_KR_KISTI | 20.000 TB |
| T3_KR_KISTI | 20.000 TB |
+-------------+-----------+
+------------------+---------+
| RSE EXPRESSION | LIMIT |
|------------------+---------|
+------------------+---------+
(2) Checking the usage
[geonmo@ui20 ~]$ rucio list-rse-usage --show-accounts T3_KR_KISTI
USAGE:
------
used: 16.726 TB
rse: T3_KR_KISTI
updated_at: 2021-09-08 02:20:32
source: expired
rse_id: 1983977ffc0e47ffb2b1c4d496a8297f
------
used: 4.157 TB
rse: T3_KR_KISTI
updated_at: 2021-09-08 02:20:07
source: obsolete
rse_id: 1983977ffc0e47ffb2b1c4d496a8297f
------
files: 78660
used: 109.455 TB
rse: T3_KR_KISTI
updated_at: 2021-09-02 06:46:12
source: rucio
per account:
------
account: sync_t3_kr_kisti used: 64.192 TB percentage: 58.65
------
account: jlee used: 12.457 TB percentage: 11.38
------
account: geonmo used: 7.077 TB percentage: 6.47
------
account: heewon used: 4.831 TB percentage: 4.41
------
account: yhoonlee used: 3.691 TB percentage: 3.37
------
account: jichoi used: 2.658 TB percentage: 2.43
------
account: soohwan used: 2.204 TB percentage: 2.01
------
account: mchoi used: 1.995 TB percentage: 1.82
------
account: yjeong used: 1.507 TB percentage: 1.38
------
account: haoh used: 128.632 GB percentage: 0.12
------
account: gylee used: 50.821 GB percentage: 0.05
------
account: dakang used: 10.063 GB percentage: 0.01
------
rse_id: 1983977ffc0e47ffb2b1c4d496a8297f
------
used: 15.442 GB
rse: T3_KR_KISTI
updated_at: 2021-09-02 15:00:13
source: unavailable
rse_id: 1983977ffc0e47ffb2b1c4d496a8297f
------
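The per-account lines in this output can be parsed to compare your own usage with your quota. The sketch below assumes decimal units (1 TB = 1000 GB), which is consistent with the percentages printed above, and computes each share relative to the total reported under "source: rucio". The function names parse_accounts and share_percent are hypothetical.

```python
import re

# Conversion to TB, assuming decimal units (1 TB = 1000 GB).
UNIT_TO_TB = {"GB": 1e-3, "TB": 1.0, "PB": 1e3}

ACCOUNT_RE = re.compile(
    r"account:\s*(\S+)\s+used:\s*([\d.]+)\s*(GB|TB|PB)\s+percentage:\s*([\d.]+)"
)


def parse_accounts(text):
    """Return {account: used space in TB} from 'rucio list-rse-usage --show-accounts' output."""
    usage = {}
    for name, value, unit, _percentage in ACCOUNT_RE.findall(text):
        usage[name] = float(value) * UNIT_TO_TB[unit]
    return usage


def share_percent(used_tb, total_tb):
    """Share of the total rucio-managed space, in percent with two decimals."""
    return round(100.0 * used_tb / total_tb, 2)


if __name__ == "__main__":
    sample = """account: geonmo used: 7.077 TB percentage: 6.47
account: haoh used: 128.632 GB percentage: 0.12"""
    usage = parse_accounts(sample)
    total_tb = 109.455  # "used" value of the "source: rucio" block above
    for name, used in usage.items():
        print(name, used, share_percent(used, total_tb))
```

Subtracting your parsed usage from the limit shown by rucio list-account-limits gives the quota you still have available on that RSE.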