Finding the maximum temperature using a Pig script
The input data set can be obtained here:
https://drive.google.com/file/d/0BwiqVGNpnBVIbDZ6Q1V1RThxYXc/edit?usp=sharing
My Hadoop path is: /usr/local/hadoop/
The Hadoop user's home directory is: /home/hduser
Steps:
- Start Hadoop:
hduser@kaustuv-studio14:/home/kaustuv$ /usr/local/hadoop/bin/start-all.sh
- Copy the input file into HDFS with the -copyFromLocal command and make sure it exists there. Listing the target directory (for example with hadoop fs -ls) should show the weather.txt file.
- Start the Pig grunt shell in MapReduce mode:
hduser@kaustuv-studio14:/home/kaustuv$ pig
- Write the following max-temperature Pig script:
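-- Load each line of the weather file as a single string
A = load '/home/hduser/weather.txt' AS (f1: chararray);
-- Extract the year (characters 4 to 7) and the temperature (characters 38 to 42)
B = foreach A generate SUBSTRING(f1, 4, 8) AS (year: chararray), SUBSTRING(f1, 38, 43) AS (temp: chararray);
-- Group the records by year (the first field)
C = group B by $0;
-- Find the maximum temperature within each year
Max_temp = foreach C generate group, MAX(B.temp);
-- Store the result in HDFS
store Max_temp INTO 'MAX_Temp_Output';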
Internally, the Pig script is converted into a MapReduce program. Its progress can be checked via the web interfaces of the NameNode and the JobTracker.
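To see what the script computes, here is a minimal local sketch of the same logic in Python: slice the year and temperature out of each fixed-width line, group by year, and take the maximum per group. The helper names (map_record, max_temp_by_year, make_line) and the sample records are hypothetical, and the synthetic lines only imitate the column positions used by SUBSTRING, not the real layout of weather.txt. The sketch also converts the temperature to a float so that the comparison is numeric, whereas the Pig script keeps it as a chararray.

```python
from collections import defaultdict


def map_record(line):
    """Mimic the foreach step: slice year and temperature from a fixed-width line."""
    year = line[4:8]      # same range as SUBSTRING(f1, 4, 8)
    temp = line[38:43]    # same range as SUBSTRING(f1, 38, 43)
    return year, temp


def max_temp_by_year(lines):
    """Mimic 'group B by $0' followed by MAX over each group."""
    groups = defaultdict(list)
    for line in lines:
        year, temp = map_record(line)
        # Assumption: compare temperatures numerically, not as strings
        groups[year].append(float(temp))
    return {year: max(temps) for year, temps in sorted(groups.items())}


def make_line(year, temp):
    """Build a synthetic record (hypothetical layout, for illustration only)."""
    # 4 filler chars, 4-char year, 30 filler chars, 5-char right-aligned temperature
    return "xxxx" + year + "x" * 30 + f"{temp:5.1f}"


sample = [
    make_line("1941", 106.2),
    make_line("1941", 93.8),
    make_line("1942", 183.9),
]
result = max_temp_by_year(sample)
print(result)
```

On a cluster, the slicing corresponds roughly to the map phase and the per-year maximum to the reduce phase of the generated MapReduce job.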
The output will be stored in the MAX_Temp_Output folder inside the user's HDFS home directory, here '/user/hduser'.
- The output can be verified using the 'cat' command (for example, hadoop fs -cat on the files in MAX_Temp_Output).
This will list:
Warning: $HADOOP_HOME is deprecated.
1941 106.2
1942 183.9
1943 176.7
1944 156.2
1945 130.6
1946 152.3
1947 191.1
1948 175.9
1949 181.1
1950 208.8
1951 168.8
1952 122.6
1953 126.5
1954 232.3
1955 130.2
1956 114.6
1957 187.7
1958 184.5
1959 229.9
1960 204.7
1961 173.8
1962 130.8
1963 187.9
1964 144.3
1965 186.1
1966 155.9
1967 173.8
1968 93.8
1969 146.4
1970 181.4
1971 136.4
1972 128.5
1973 119.9
1974 203.2
1975 132.3
1976 157.8
1977 150.6
1978 140.0
1979 158.9
1980 119.3
1981 217.2
1982 141.2
1983 122.1
1984 154.2
1985 146.0
1986 187.9
1987 219.2
1988 164.0
1989 120.2
1990 118.0
1991 142.4
1992 149.3
1993 190.6
1994 157.4
1995 145.6
1996 107.2
1997 219.0
1998 125.2
1999 143.0
2000 195.0
2001 147.4
2002 180.2
2003 111.4
2004 168.8
2005 194.4
2006 153.8
2007 155.2
2008 140.0
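If the listing needs further processing, for instance to pick out the years with the highest and lowest maximum, a small Python sketch like the one below may help. The function name parse_output is hypothetical, and the sample holds only a few lines copied from the listing above. It assumes each result line is a year and a temperature separated by whitespace, and it skips anything else, such as the warning line.

```python
def parse_output(lines):
    """Parse 'year temperature' lines into (year, temp) pairs, skipping other lines."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # e.g. the $HADOOP_HOME warning line
        year, temp = parts
        try:
            rows.append((year, float(temp)))
        except ValueError:
            continue  # not a numeric temperature, ignore the line
    return rows


# A few lines copied from the listing above
sample_output = [
    "Warning: $HADOOP_HOME is deprecated.",
    "1941 106.2",
    "1954 232.3",
    "1968 93.8",
]

rows = parse_output(sample_output)
hottest = max(rows, key=lambda r: r[1])  # year with the highest maximum
coolest = min(rows, key=lambda r: r[1])  # year with the lowest maximum
print(hottest, coolest)
```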