A data scientist is someone who knows more statistics than a computer scientist and more computer science than a statistician. – Josh Blumenstock
D:\word.txt contains the following data:
0101
00101
00101
111
import org.apache.spark.api.java.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;

public class TestSparkJava {
    public static void main(String[] args) {
        String logFile = "D:\\word.txt";

        // Run Spark locally in a single JVM.
        SparkConf conf = new SparkConf().setMaster("local").setAppName("Demo");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read the file as an RDD of lines and cache it, since it is scanned twice.
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        // Count the lines that contain the character "0".
        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) {
                return s.contains("0");
            }
        }).count();

        // Count the lines that contain the character "1".
        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) {
                return s.contains("1");
            }
        }).count();

        System.out.println("Lines with 0: " + numAs + ", lines with 1: " + numBs);

        sc.stop();
    }
}
Answer: A
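To check the counts by hand without a Spark installation, the two filter-and-count steps can be mirrored in plain Java. This is only a sketch: the class name `LineCountSketch` and the helper `countContaining` are hypothetical, and an in-memory list stands in for the RDD that `sc.textFile` would build from the four sample lines.

```java
import java.util.Arrays;
import java.util.List;

public class LineCountSketch {
    // Hypothetical helper: counts the lines that contain the given substring,
    // mirroring logData.filter(...).count() in the Spark program.
    static long countContaining(List<String> lines, String needle) {
        return lines.stream().filter(s -> s.contains(needle)).count();
    }

    public static void main(String[] args) {
        // The four sample lines from the data file, held in memory (assumption:
        // the file contains exactly these lines and nothing else).
        List<String> lines = Arrays.asList("0101", "00101", "00101", "111");

        long numAs = countContaining(lines, "0"); // lines containing "0"
        long numBs = countContaining(lines, "1"); // lines containing "1"

        System.out.println("Lines with 0: " + numAs + ", lines with 1: " + numBs);
    }
}
```

Only the last line, `111`, has no `0`, while every line contains a `1`, so the sketch prints `Lines with 0: 3, lines with 1: 4`.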
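Copyright notice:
This is an original article by 「楠木大叔」 of 智客工坊, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.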