                                  C   F L A T

    Here is my vision of computer programming.  In part, it is a reaction
against vendor lock-in, implementation hiding, advertised but unfulfilled
interoperability, etc. that I associate with Microsoft .NET and C Sharp.  As of
now (2012), I regard C as the most elegant, most easily understood, most
portable, and for many applications the most popular language above assembly.
Efforts to replace C altogether or build a higher-level language on top of C
have failed to match the success that C has achieved in replacing assembly.
Part of my vision is to build on top of C, including an editing tool that allows
a view of the source code such that any loop or conditional code sub-block is
treated as a function call.  In this view, every function is a flat sequence of
instructions.  Hence the name, C Flat.

    I don't claim that any of this is original.  Some is certainly not.  But
after working 11 years programming in C++ for Formfactor, Inc., I have not
seen these ideas practically integrated anywhere.  If any of this is original,
I view it as a part of the language / artificial intelligence work I claimed
when I began working there, and it is too general to be directly related to
Formfactor's business, so it is not the property of Formfactor.  Though only
in brainstormed note form, I am making this public now.  I would like C Flat to
proceed as an open source project.  Actually this is a series of projects, some
of which I would like to find an existing open source project to build from
rather than starting from scratch.  I will license it (e.g., GPL) if and when
that becomes appropriate.

Mac Stevens
12 August 2012

--------------------------------------------------------------------------------


Capabilities:
* Test that two different procedures produce the same result.  When they don't, analyze sub-procedures that should produce the same result, automatically narrowing down the area where they diverge.

* Test that two versions (DEBUG/RELEASE) of a program produce the same sequence of states -- zero in on any divergence.

* Test that two separate runs of the same program produce the same sequence of states -- zero in on any divergence.

* Automatic Unit Test
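
A minimal sketch of the first capability, assuming the two procedures are ordinary C++ functions that can be run over the same list of inputs.  The names first_divergence, sum_loop, and sum_formula are hypothetical and exist only for this illustration; narrowing down into sub-procedures is not shown.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: run two procedures that should be equivalent on the
// same inputs and return the index of the first input where they disagree,
// or -1 when no divergence is found on these inputs.
int first_divergence(const std::function<long(long)>& f,
                     const std::function<long(long)>& g,
                     const std::vector<long>& inputs)
{
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        if (f(inputs[i]) != g(inputs[i])) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Illustrative procedure 1: sum of 0..n computed with a loop.
long sum_loop(long n)
{
    long s = 0;
    for (long k = 0; k <= n; ++k) s += k;
    return s;
}

// Illustrative procedure 2: closed form.  It disagrees with sum_loop for
// n < -1, which gives the helper something to find.
long sum_formula(long n) { return n * (n + 1) / 2; }
```

Calling first_divergence(sum_loop, sum_formula, inputs) with a list that contains a value below -1 reports the position of that value; a real tool would then recurse into the sub-steps of both procedures.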


Equivalent Functions
Equivalent Functions in different languages: English, C++, assembly, machine,
Drawings, diagrams, documentation -- explicitly related to other documentation and code
functions related to other functions-- cut, paste, & customize


auto-version control/delete queue

source code directory structure automatically managed
dependencies, equivalences, other relations -- automatically managed


Solidworks: Objects & Relations
- Require minimum input from user
- [Edit Parameters/Preview, then Submit] Pattern


Grammar/Sentence Structure/Relations/Sentence Diagrams
- noun + verb
- noun + verb + indirect object + direct object
- (noun+adjective) + verb


Example of setting a condition/restriction on a block of code:

CODE BLOCK:
void SPIClass::begin() {

pinMode(SCK, OUTPUT);
pinMode(MISO, INPUT);
pinMode(MOSI, OUTPUT);
pinMode(SS, OUTPUT); // 1 set SS pin as output mode

digitalWrite(SCK, LOW);
digitalWrite(MOSI, LOW);
digitalWrite(SS, HIGH);

SPCR |= _BV(MSTR);
SPCR |= _BV(SPE); // 2 enable SPI
}


Set relation:
[CODE SUB-BLOCK 1] <MUST_PRECEDE> [CODE SUB-BLOCK 2] <REASON> ["Important in this example is the setting of the SS pin as output pin. This has to be done before the SPI is enabled in master mode. Enabling the SPI while the SS pin is still configured as an input pin would cause the SPI to switch to slave mode immediately if a low level is applied to this pin. This pin is always configured as an input pin in slave mode (see Figure 2-7 on page 11)." - ]
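
One way such a relation could be stored and checked, as a sketch only.  It assumes each code sub-block has an integer id; the MustPrecede struct and the violated function are hypothetical names, not part of any existing tool.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for a <MUST_PRECEDE> relation between two sub-blocks.
struct MustPrecede {
    int first;           // id of the sub-block that has to run earlier
    int second;          // id of the sub-block that has to run later
    std::string reason;  // the <REASON> text attached to the relation
};

// Return the relations violated by a proposed ordering of sub-block ids.
// A relation is checked only when both of its sub-blocks appear in the order.
std::vector<MustPrecede> violated(const std::vector<int>& order,
                                  const std::vector<MustPrecede>& relations)
{
    std::vector<MustPrecede> bad;
    for (const MustPrecede& r : relations) {
        auto a = std::find(order.begin(), order.end(), r.first);
        auto b = std::find(order.begin(), order.end(), r.second);
        if (a != order.end() && b != order.end() && b < a) {
            bad.push_back(r);
        }
    }
    return bad;
}
```

An editor could run this check whenever sub-blocks are reordered and show the stored reason for each violated relation.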


More example relations:

comment:
[CODE SUB-BLOCK 1] <COMMENT> ["set SS pin as output mode"]


serial sequence:
[CODE SUB-BLOCK 1] [CODE SUB-BLOCK 2] [CODE SUB-BLOCK 3] ... [CODE SUB-BLOCK N]


auto-tuning
  Current C++ coding methodology requires fixing object sizes at compile time.  There is often a memory/performance trade-off in determining the number of bits used for an integer, and for other pre-allocated objects.  Many other algorithm choices are pre-configured.  These could be tuned against the problem at hand.  A standard pre-configured DLL could have the best default settings.  Several other pre-configured DLLs could have other settings, easily swapped in.  Finally, if the runtime is too long, the auto-tuner could evaluate the slow spots and re-compile a special-purpose DLL for the current problem.


semi-assertions, dynamic check enabling
  Some conditions are possible errors (warnings), some are always errors.  Some checks take too long to run to be enabled in every place they apply.  The programming/runtime environment should be able to disable and enable various checks to zero-in on problems when needed, and regulate the quantity and sequence of error messages presented to the user/programmer.
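
A rough sketch of run-time check enabling, assuming checks are identified by name and supplied as callables so that a disabled check costs nothing to evaluate.  CheckRegistry and Severity are hypothetical names invented for this example.

```cpp
#include <functional>
#include <map>
#include <string>

enum class Severity { Warning, Error };

// Hypothetical registry of named checks that can be switched on and off
// while the program runs.
class CheckRegistry {
public:
    void enable(const std::string& name, bool on) { enabled_[name] = on; }

    // Evaluate a check.  A disabled check is not evaluated and counts as
    // passing.  Returns false only when an enabled Error-level check fails.
    bool check(const std::string& name, Severity sev,
               const std::function<bool()>& condition)
    {
        auto it = enabled_.find(name);
        if (it == enabled_.end() || !it->second) return true;  // not enabled
        if (condition()) return true;
        ++failures_[name];                // keep statistics for later reports
        return sev == Severity::Warning;  // warnings do not stop execution
    }

    int failures(const std::string& name) const
    {
        auto it = failures_.find(name);
        return it == failures_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, bool> enabled_;
    std::map<std::string, int> failures_;
};
```

The failure counts give the environment something to regulate: it can decide how many messages to show and in what order, instead of printing every failure as it happens.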


Competitive/Non-competitive,   Secret/Public,   for-profit/open-source
Programmers deserve compensation for their work.  Users should pay for benefit received.  True information, helpful programs, should be spread as broadly as possible.  A programmer, or engineer, or any worker, should not need the support of an entire corporation, with its inherent restrictions, to be compensated for his work.


Code Units, Compilation Units
  Currently code is stored in files.  Each file is compiled as a whole.  Instead, code should be managed by a database.  The function or smaller code block could be the basic unit.  These could be assembled into files and sent to the compiler.  For editing, similar grouping could be done.  Much unnecessary re-compilation is currently done for changes to comments or other unrelated sections of code.

Explicit temporary variables

Explicit sequence points

Explicit function calls
  -constructors
  -destructors
  -start-up functions
  -exception? ? goto
  -function sequence within boolean expression:
    if ((A() && B()) || C())
    -> A(&boolA)
       (boolA) ?
          G(bool_if) :
          H(bool_if);
          ...
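
The expansion above is left unfinished in these notes.  One possible reading, written in plain C++ rather than C Flat, is sketched here: every call and every temporary gets its own line while the short-circuit order is preserved.  The functions A, B, C and the call_trace string are stand-ins chosen for the illustration.

```cpp
#include <string>

// Hypothetical trace so the order of calls can be observed.
std::string call_trace;

bool A() { call_trace += 'A'; return true; }
bool B() { call_trace += 'B'; return false; }
bool C() { call_trace += 'C'; return true; }

// Compact form: the short-circuit rules decide which functions run.
bool compact() { return (A() && B()) || C(); }

// Flattened form: each call and each temporary is explicit, one step per
// line, with the same call order as the compact form.
bool flattened()
{
    bool boolA = A();
    bool boolAB = boolA;
    if (boolA) {
        boolAB = B();   // B runs only when A returned true
    }
    bool bool_if = boolAB;
    if (!boolAB) {
        bool_if = C();  // C runs only when (A && B) is false
    }
    return bool_if;
}
```

Both forms call the same functions in the same order and return the same value; only the flattened one shows that order on the page.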

explicit state
  - often, state is implicit, indicated by the current call stack, or current sequence point.  There should be a way to explicitly record the state in variables that need not be present in release builds.

Strict C-flat
  no global variables
  no local variables
  no temporary variables
  all functions void
  function = class
  no embedded loops
  loop body = function
  goto!
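
A sketch of what the strict rules might look like when approximated in ordinary C++, borrowing the while loop of func_a from the func_ab.c listing later in these notes.  The struct and function names are hypothetical; this is one interpretation of the list, not a definition of the language.

```cpp
#include <cstdlib>

// "function = class": everything the function touches lives in one struct.
struct FuncA {
    int i, j, k;  // inputs
    int p, q;     // former local variables
    bool cond;    // former temporary holding the loop test
    int result;   // former return value
};

// Every step is a void function of the state.
void FuncA_init(FuncA& s) { s.p = std::abs(s.i + s.j); s.q = s.j + 2 * s.k; }
void FuncA_test(FuncA& s) { s.cond = s.q > s.p; }
void FuncA_body(FuncA& s) { s.p *= 2; s.q -= s.p; }

// No embedded loop: the loop body is a function and control flow is goto.
void FuncA_run(FuncA& s)
{
    FuncA_init(s);
loop:
    FuncA_test(s);
    if (!s.cond) goto done;
    FuncA_body(s);
    goto loop;
done:
    s.result = s.q;
}
```

For inputs i=1, j=2, k=10 the state goes p=3, q=22, then p=6, q=16, then p=12, q=4, and the result is 4, the same as the original loop.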


constify, unconstify -- make a variable a const, after initialization or just for a certain section
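
C++ has no constify keyword, but the "const after initialization" half can be approximated today.  The sketch below uses an immediately invoked lambda so the mutable phase is confined to its own scope; sum_of_squares is a made-up example function.

```cpp
#include <vector>

// Build a value with ordinary mutable code, then hold it as const.
int sum_of_squares(int n)
{
    const std::vector<int> squares = [n] {
        std::vector<int> v;  // mutable only during initialization
        for (int k = 1; k <= n; ++k) v.push_back(k * k);
        return v;
    }();
    // squares.push_back(0);  // would not compile: squares is const from here

    int total = 0;
    for (int s : squares) total += s;
    return total;
}
```

The "unconstify" direction and constness limited to a section of code have no equally direct equivalent; a const reference bound inside a nested scope is the nearest approximation.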


auto- code organization

multiple views of single line of code:
  expanded to multiple lines
  various graphical representations


auto-learning common bugs
  As each bug is fixed, however small, the programmer can categorize the problem.  Maybe the problem was simply that the function was never implemented, or whatever.  Then the IDE starts to learn that un-implemented functions sometimes cause bugs and can start to flag them as warnings when the user is debugging and searching for the cause of a bug.


GUI
Ptolemy project, EECS, UC Berkeley
Hugo Andrade, National Instruments, hugo.andrade@ni.com


                        C FLAT

C Flat, also known as Factored C, is an extension of C++.

No global variables.
No local variables.
Only class member variables.
No member functions.
Only global functions of the form void F(X).
No operator overloading, in fact, no operators at all
No virtual functions
No inheritance
No other subtle tricks
Every function and class is a template


class Point
{
    mX, mY
}

SetX(p, x)
{
    assert(p.type == Point);

}


Development Cycle
1. Make feature request/test/assertion.  Enable/disable assertion(s).
2. Run test(s).  Use software.
3. Notice error or illogical situation.  Record error.  Log bug.
4. Get more information.  Narrow area of investigation.  Go to step 1 or 5
5. Make change.  If situation matches some pattern, implement pattern solution.  Go to step 1 or 2.


Recognize duplicate or similar code.  When code changes, make assertion or suggestion that the corresponding similar code should change.


Not just pass/fail assertions.  Also, suggestions (of possible errors, or of things to try), warnings (possible errors), %certainty of error.  Capable of turning on or off assertions for a given time period or given situation.  Snooze button.


Automatically test the envelope of limits on input & output


Explicit object ownership
B b;
A *a = new A();
b owns a; // 'owns' is keyword

Compiler can detect memory leaks by un-owned pointers passing out of scope.
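
Standard C++ has no owns keyword.  The nearest existing facility is std::unique_ptr, which makes ownership visible in the type and frees the object when the owner goes away.  The sketch below is that analogue, not the proposed language feature; the live counter is a hypothetical device for observing destruction.

```cpp
#include <memory>
#include <utility>

struct A {
    explicit A(int* live) : live_(live) { ++*live_; }
    ~A() { --*live_; }
    A(const A&) = delete;
    A& operator=(const A&) = delete;
private:
    int* live_;  // hypothetical counter of live A objects, for demonstration
};

// B owns an A: the ownership is stated in the member's type.
struct B {
    std::unique_ptr<A> a;
};

// Create an A, hand it to a B, and let both go out of scope.
void demo(int* live)
{
    B b;
    std::unique_ptr<A> a = std::make_unique<A>(live);
    b.a = std::move(a);  // closest standard analogue of "b owns a;"
}                        // b is destroyed here and deletes the A it owns
```

Unlike the proposal, this does not stop raw new from being used elsewhere; it only makes the owned case explicit.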

Garbage collection/memory defragmentation, etc. OK, but all explicit.


Minimal Assumptions/Occam's Razor
C flat has many layers of rules and restrictions.  At the bottom, anything goes -- no rules.  On top of this is a system that allows rules to be customized.  On top of this are more rules.  ... At the top is specific data.  The reason for this is that any particular language or programming style should not be outlawed unless there is a reason for prohibiting it.  There should be a way to clearly define freedoms and restrictions for the group and for the individual.


Preemptive Action
When the computer is not busy with assigned tasks, it does tasks it expects to be assigned -- all types of tasks


Handle Faulty Compiler
Sometimes the compiler gives a very unhelpful error message.  This can be handled by a binary search, feeding the compiler different portions of the source code.  The search can determine the exact portion of the code causing the compiler to complain, and whether or not the compiler's error messages are repeatable.
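
A sketch of the search itself, with the compiler hidden behind a hypothetical oracle.  It assumes the failure is monotonic (once a prefix of the source fails, every longer prefix fails too), which real source code will not always satisfy because a truncated file may fail for unrelated reasons such as unbalanced braces.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical oracle: returns true if the compiler fails when given only
// the first n lines of the source.
using FailsOnPrefix = std::function<bool(std::size_t)>;

// Binary search for the smallest prefix length that makes the compiler fail.
// Precondition: fails(total_lines) is true and fails(0) is false.
std::size_t first_failing_line(const FailsOnPrefix& fails,
                               std::size_t total_lines)
{
    std::size_t lo = 0;            // longest prefix known to compile
    std::size_t hi = total_lines;  // shortest prefix known to fail
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (fails(mid)) {
            hi = mid;
        } else {
            lo = mid;
        }
    }
    return hi;  // 1-based number of the line whose addition triggers failure
}
```

Repeatability could be probed by calling the oracle more than once for the same prefix and comparing the answers.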


Alternate Functions
Multiple functions with equivalent or exactly equal results.  Computer can choose which to use, to optimize.  Also, functions or ways to get info on the efficiency (time & memory) of a function based on input data or no input data.
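
As an illustration, here are two interchangeable membership tests for a sorted vector, each paired with a cost estimate, and a chooser that picks by estimated cost.  The cost formulas are invented placeholders, not measurements, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Two functions with exactly equal results on a sorted vector.
bool contains_linear(const std::vector<int>& v, int x)
{
    return std::find(v.begin(), v.end(), x) != v.end();
}

bool contains_binary(const std::vector<int>& v, int x)
{
    return std::binary_search(v.begin(), v.end(), x);
}

// Hypothetical cost estimates in abstract steps, as functions of input size.
double cost_linear(std::size_t n) { return static_cast<double>(n); }
double cost_binary(std::size_t n)
{
    return 8.0 + std::log2(static_cast<double>(n) + 1.0);
}

using Contains = bool (*)(const std::vector<int>&, int);

// Pick the implementation with the lower estimated cost for this size.
Contains choose(std::size_t n)
{
    return cost_linear(n) <= cost_binary(n) ? contains_linear
                                            : contains_binary;
}
```

With these placeholder estimates a 4-element input selects the linear scan and a 1000-element input selects the binary search.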


Alternative data representations.
File + zipped file + way to keep them in sync.
.obj files put together into various combinations of .dlls and static libs, so that minimal incremental re-compilation and linking is needed for the edit-compile-test cycle.


Firewalls/Watertight Chambers
Each program is divided into semi-independent sections that can be shut down, memory cleared, etc. with minimal effect on the rest of the program.  Each section may be sub-divided.  That way, when something crashes, the minimal amount of program has to be shut down.


Organizing Data In Blocks
Reason: parallel processing, split giant blocks of data into smaller blocks for processing by several processors
Ideas:
Each block has a revision number.  Bigger blocks may also have a time stamp giving when they were last updated.  Some blocks may have a lock.
Equivalent blocks: compressed version, indexed version
Comparison functions: compare blocks, merge blocks.
One block may contain a summarized version of many sub-blocks and pointers to those sub-blocks.
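
A small sketch of blocks with revision numbers and a summary block that points at its sub-blocks.  The summary here is a sum, chosen arbitrarily; Block, SummaryBlock, and the helper functions are hypothetical names, and locks and time stamps are left out.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

struct Block {
    unsigned revision = 0;  // bumped on every change
    std::vector<int> data;
};

void append(Block& b, int value) { b.data.push_back(value); ++b.revision; }

// A parent block holding a summary of each sub-block, together with the
// revision that summary was computed from.
struct SummaryBlock {
    std::vector<const Block*> children;  // pointers to the sub-blocks
    std::vector<unsigned> seen;          // revision each cached sum reflects
    std::vector<long> sums;              // cached per-child sums
};

void add_child(SummaryBlock& s, const Block* b)
{
    s.children.push_back(b);
    s.seen.push_back(b->revision);
    s.sums.push_back(std::accumulate(b->data.begin(), b->data.end(), 0L));
}

// Recompute only the sub-blocks whose revision changed.
// Returns how many sub-blocks were recomputed.
std::size_t refresh(SummaryBlock& s)
{
    std::size_t recomputed = 0;
    for (std::size_t i = 0; i < s.children.size(); ++i) {
        const Block& b = *s.children[i];
        if (b.revision != s.seen[i]) {
            s.sums[i] = std::accumulate(b.data.begin(), b.data.end(), 0L);
            s.seen[i] = b.revision;
            ++recomputed;
        }
    }
    return recomputed;
}

long total(const SummaryBlock& s)
{
    return std::accumulate(s.sums.begin(), s.sums.end(), 0L);
}
```

Because each sub-block carries its own revision, unchanged sub-blocks are skipped on refresh, and separate sub-blocks could be handed to separate processors.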


Prediction & Estimation
Predict what major steps are coming up, and how much time, CPU effort, memory, and other resources they require.
When crashed and demanding user OK or specific choices, try to predict what future choices are to be made and whether the user can approve all of them at once, or some group of choices at once.


No hidden functions & transformations.  In C++ "a=b;" might mean "a = ConvertBToA(b);"  All such transformations are viewable by expanding the current section of code into its sub-parts.

No hidden parts of objects.  Virtual function table pointers, etc. are all viewable.


Convenience tends to kill modularity.  For example, deep in an algorithm, you want to pop up a message box with a warning.  So the algorithm now depends on the GUI.  To solve this, layers of interpreters could link the two together.  if (there is a GUI) then { pop open a message box:  Windows code: { ... }  Mac OS X code: { ... }  default code: { ... } }
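
One conventional way to get part of the way there in today's C++ is to pass the algorithm a callback, so it reports the warning without knowing how it is displayed.  This is a sketch of that substitute technique, not of the layered interpreters described above; WarningSink and safe_divide are hypothetical names.

```cpp
#include <functional>
#include <string>

// Hypothetical sink for warnings.  The algorithm knows only this signature;
// a GUI build could supply a message box, a console build could print.
using WarningSink = std::function<void(const std::string&)>;

// Deep algorithm code: reports a warning without depending on any GUI.
int safe_divide(int num, int den, const WarningSink& warn)
{
    if (den == 0) {
        if (warn) warn("division by zero, returning 0");
        return 0;
    }
    return num / den;
}
```

Passing an empty WarningSink gives the "no GUI" default: the algorithm still runs and the message is dropped.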


method for naming and tracking strings and other sub-products through the system.  Sometimes, it's not clear where a message came from.  Need to trace back to the source for debugging.


reduce waste  (Ockham's Razor, Toyota Way)
  - waste during programming/debugging
  - waste in exchanging money between programmer and user
  - waste in reporting & fixing bugs
  - waste in documentation
  - waste in exchanging ideas  (non-disclosure agreements)


for every function/action/procedure, there are standard ways to define how much resources (time, memory, etc.) are estimated to be needed to complete it.


You have a variable or some structure.  Then you have all kinds of data about it.  Really, it's just a block of memory.  Then, there's its type, which is how it's supposed to be interpreted.  Then there's comments, assertions, invariants, etc.  There's data that's always true (its type), then there's temporary data (is it initialized, is it de-allocated, etc.)  There are various standard functions and queries : copy, debug dump, graphical display, comparison, hash, save/load, etc.  The way all this data is managed should be standardized.  Similarly for functions/procedures.


Reducing waste.  Re-use existing languages and programs and libraries.  Make translators for any new language.  View the C-flat project as a series of small improvements, not a giant change.  Minimize requirements, minimize the difficulty of a new user, not for the purpose of trapping the user, but because it's the right way.


Right now in C++, there's no way to make an assertion that says, this condition is wrong/maybe wrong/always wrong/sometimes wrong, but continue anyways, and if I'm debugging, flag it, or otherwise take statistics on it.

There's no way to estimate how long a procedure takes, and to judge what procedure to do, based on this estimate.

There's no way to state that one function exists just to give info on another function, just for debugging.

There's no way to cleanly break, with options: break NOW, break and do minimal clean up, break and do normal clean up.  break and save where you are for re-running.


I need help tracking objects, when they were changed, etc.


I need a better way to deal with changing standards.  Set up some file format quick & dirty, then try to improve and you get stuck.  version number + time = complete version number.


c-flat / debugging file formats:
* http://www.ibm.com/developerworks/opensource/library/os-debugging/index.html?ca=drs-
* something based on CLIPS?


Gantt Chart
http://en.wikipedia.org/wiki/Gantt

Work Breakdown Structure
http://en.wikipedia.org/wiki/Work_breakdown_structure


Geir Isene WOIR chart


A way to market your code/function:  When programming, you want to search for a function, you search by input/output spec.  All the functions (in the world) are available, some free, some cost.  Some use more memory, but are faster.  Some are better tested, etc.  You can pick one as standard to put into your program, but later it can be swapped.  When you publish your code, and the user is finally using it, the user can also swap out the functions inside of yours, etc, by paying more money.


Every text string shown to the user, every icon, every pop-up, etc. can be considered a communication.  Each communication has a source, a chain of occurrences which led to it happening.  The user could potentially respond to each communication.  The user's response might be:
  - this is a bug  (I'll pay $.05 to have it fixed)
  - this is a great feature
  - I don't understand
  - how do I get rid of this/ hide this/ silence this/ snooze this?

The response could be logged and sent back to the authors of the relevant code or the owners of that code


Find the basic concepts.  Find the basic ideas.  Organize ideas.  Be able to translate an idea from the Bible to the Koran to ... anyone's frame of reference.  Find common ideas and differing ideas.  Have a way to share ideas.  Keep track of the best lessons, the most effective presentations of an idea, the best definitions, slide presentations, movies, etc. that best communicate an idea.


Debugging:  need the ability to track changes to a portion of the data.  Treat objects or groups of objects as a database record.  Monitor when the record was changed to help determine when the state went invalid, or verify that it was never initialized, etc.


built into the language: time limits on a function.  After time A, start reporting progress.  After time B, exit with failure.  Time limit could be a function of input data.  If less than 2 seconds, proceed.  If more than 2 seconds, refine completion estimate.  If after 10 seconds, refine completion estimate, get approval for more time, etc.
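
C++ has nothing like this built in, but the policy can be sketched as a library.  The example assumes the work is split into steps so the limits can be checked between them; TimeLimit, classify, and run_with_limit are hypothetical names, and asking the user for more time is not shown.

```cpp
#include <chrono>

enum class TimeAction { Proceed, ReportProgress, Fail };

// Hypothetical policy: the two limits attached to a function.
struct TimeLimit {
    std::chrono::milliseconds report_after;  // "time A"
    std::chrono::milliseconds fail_after;    // "time B"
};

// Decide what to do given how long the function has been running.
TimeAction classify(const TimeLimit& limit, std::chrono::milliseconds elapsed)
{
    if (elapsed >= limit.fail_after) return TimeAction::Fail;
    if (elapsed >= limit.report_after) return TimeAction::ReportProgress;
    return TimeAction::Proceed;
}

// Call step() until it returns true (done) or time B is exceeded.
// progress(elapsed) is called after each unfinished step past time A.
template <class Step, class Progress>
bool run_with_limit(const TimeLimit& limit, Step step, Progress progress)
{
    const auto start = std::chrono::steady_clock::now();
    for (;;) {
        if (step()) return true;
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::milliseconds>(
                std::chrono::steady_clock::now() - start);
        switch (classify(limit, elapsed)) {
            case TimeAction::Fail: return false;
            case TimeAction::ReportProgress: progress(elapsed); break;
            case TimeAction::Proceed: break;
        }
    }
}
```

A limit that depends on the input data would simply compute the TimeLimit from the input before calling run_with_limit.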


Method for Debugging/Adding Feature.  In area where feature is missing, or bug exists, add redundancy, "scaffolding", "scar tissue", workarounds, debug diagnostics, etc.   Attempt to subdivide problem (new feature), or isolate problem (bug).  Repeat until problem area small enough to determine longer term solution.  Some redundancy may be necessary permanently.  A unit test is the extreme of this, where most of the code is "scaffolding".


Verify pointers still point to valid memory.


Debug Cycle
1. Enable more assertions.
2. Re-run.
3. Narrow trouble area.  Return to 1, or
4. Fix root cause.
5. Store new unit tests
6. Re-run increasingly bigger tests


Object Hash -> detect change in sequence on runs that should have same results -> isolate code producing variable results.
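
A sketch of the hash-trace idea, assuming the program can serialize the state it cares about to a string at chosen sequence points.  FNV-1a is used here only because it is simple and gives the same value on every run and machine; HashTrace and first_difference are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// FNV-1a 64-bit hash of a byte string.
std::uint64_t fnv1a(const std::string& bytes)
{
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Hypothetical recorder: one hash per sequence point of a run.
struct HashTrace {
    std::vector<std::uint64_t> hashes;
    void record(const std::string& state) { hashes.push_back(fnv1a(state)); }
};

// Index of the first sequence point where two runs differ, or -1 if the
// traces are identical.  A length mismatch counts as a difference at the
// end of the shorter trace.
long first_difference(const HashTrace& a, const HashTrace& b)
{
    const std::size_t n = a.hashes.size() < b.hashes.size()
                              ? a.hashes.size()
                              : b.hashes.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (a.hashes[i] != b.hashes[i]) return static_cast<long>(i);
    }
    return a.hashes.size() == b.hashes.size() ? -1L : static_cast<long>(n);
}
```

The code between the last matching sequence point and the first differing one is where the variable behavior has to be.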

View Data

Auto-Manage Assertions.


All programs/subroutines are like business organizations and need to: predict resources needed, subdivide tasks, report on progress, etc.


Instead of hiding the internal memory details, as does C# or Java, instead a compiler warning could be issued for unnecessarily accessing internal details.  A quality metric for the program could be a measurement of excessive dependencies, this being one of them.  Or, alternate equivalent functions could exist which don't depend on internal details, but are perhaps less efficient.  Also, assertions could exist to verify that assumptions of internal details are valid.  Assert(CPU == PowerPC);  Assert(memory model == ... );   if (memory model == A) { ... } else { ... }
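
Byte order is one concrete internal detail where this can be shown.  The sketch below has a fast function that depends on the machine's byte order, a portable equivalent that does not, and explicit checks of the assumptions.  The function names are hypothetical.

```cpp
#include <climits>
#include <cstdint>
#include <cstring>

// Assumption checked at compile time rather than hidden.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Run-time query about an internal detail of the machine.
bool is_little_endian()
{
    const std::uint32_t one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Version that depends on the machine's byte order.
std::uint32_t load_native32(const unsigned char* p)
{
    std::uint32_t v = 0;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Portable equivalent: reads a little-endian value on any machine.
std::uint32_t load_le32(const unsigned char* p)
{
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Uses the byte-order-dependent version only when its assumption holds.
std::uint32_t load32(const unsigned char* p)
{
    return is_little_endian() ? load_native32(p) : load_le32(p);
}
```

A dependency metric could count calls to load_native32 as "relies on internal details" and calls to load_le32 as clean.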


Compare one run (bug) with another run (no bug): find similarities and differences to isolate where the bugged code went wrong.


c-flat language: no variables.  unambiguous way to label any value of any function call of any point in any call stack in the program, or in any program running on any computer, for that matter.
  i = i + 1;  // C language
  i_[343] = i_[342] + 1; // C-flat language
  i_[343][process300][computer 192.0.0.15] = i_[342][process300][computer 192.0.0.15] + 1; // C-flat language
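
Within a single process, the numbered-value idea can be imitated by a container that never overwrites: each assignment appends a new value and returns its index.  History is a hypothetical class for illustration; the process and computer parts of the label are not modeled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical "no variables" storage: every assignment creates a new
// numbered value instead of overwriting the old one.
template <class T>
class History {
public:
    explicit History(T initial) : values_(1, initial) {}

    // Append a new value and return its index, like the 343 in i_[343].
    std::size_t assign(T v)
    {
        values_.push_back(v);
        return values_.size() - 1;
    }

    const T& at(std::size_t n) const { return values_.at(n); }
    const T& latest() const { return values_.back(); }
    std::size_t latest_index() const { return values_.size() - 1; }

private:
    std::vector<T> values_;
};
```

With this, i = i + 1 becomes i.assign(i.latest() + 1), and every earlier value of i stays addressable by its index for debugging.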


Operator overloading in C++ obscures that actually a function is being called.  Function overloading (by different parameters, etc.) obscures that different functions are being called.


Network Linking, Redundant Linking
Instead of relying on a single link between objects or concepts, a network of redundant links and clues allow the links to be repaired later when information is lost.


Implied State, Made Explicit
Sometimes it helps to have an enumerated variable that indicates state.  There is always some state for each sequence point in a function/procedure.  For debugging, it can help to update a state variable which can be omitted or implied when not debugging.


Extra Variables, only for viewing and debugging.


http://clang.llvm.org/
http://llvm.org/

Abstract Syntax Tree  http://en.wikipedia.org/wiki/Abstract_syntax_tree
Parse Tree, Concrete Syntax Tree   http://en.wikipedia.org/wiki/Concrete_syntax_tree
http://en.wikipedia.org/wiki/Abstract_semantic_graph

Ontology  http://en.wikipedia.org/wiki/Ontology
http://en.wikipedia.org/wiki/Ontology_(computer_science)


Drilling down into the meaning or alternate representation of a
statement in a computer program is similar to defining the words and
symbols in any writing.

c-flat tools need not limit themselves to a particular language.
Ideally cf tools would work well for English.


-------WRITTEN NOTES-------

Programmer Expert System
  Expert System: CLIPS
  http://clipsrules.sourceforge.net/index.html
  http://sourceforge.net/projects/clipsrules/files/CLIPS/6.30/

Interpreter
Preprocessor
Compiler
Linker

Version Control

 +-----+   pass tests   +------+
 |     |--------------->|      |
 | DEV |                | GOOD |
 |     |<---------------|      |---->+--------+     +--------+     +--------+
 +-----+  merge others' +------+<----|        |---->|        |---->|        |
            changes                  | GROUP  |     | GROUP  |     |   ...  |
                                     |  DEV   |     |  GOOD  |     |        |
 +-----+                +------+---->|        |<----|        |<----|        |
 |     |--------------->|      |<----+--------+     +--------+     +--------+
 | DEV |                | GOOD |
 |     |<---------------|      |
 +-----+                +------+

known good versions
back out bad changes

+-profiling -- coverage -- automatic unit test & regression test selection
| automatic local control
|debug code
| memory leak checker
| preconditions -- integrated with unit tests
| postconditions -- integrated with unit tests
| automatic local control
| trace (printf) -- automatic divide and conquer search for crash and for bug
|   => create unit test
|
|known good points of execution
| input /save button/environment
|code management
| database
| purpose
| preconditions
| postconditions
| alternate functions/classes
+-action/call stack

object invariants/states

code optimizer
 find high-traffic code
  try to optimize
  flag for optimization
 find memory-intensive objects
  try to optimize

----------------------------


C Flat Data Schema

Instruction:
  (single C statement causing some action, as opposed to data declaration)

Sequence Point:

Instruction Sequence:
  Sequence Point
  Sequence Point, Instruction, Sequence Point
  Sequence Point, Instruction, Sequence Point, Instruction, Sequence Point
  etc.

Comment
  A note of any type related to an Instruction Sequence

Equivalent Representation
  A comment or block of code or image, etc. that is equivalent in some way to an Instruction Sequence.
  "++i;" <==> "i=i+1" <==> "Increment i" <==> [.gif] <==> [generated assembly code]


-------------------------------------------------------------------------------


File Format: Extend SUO-KIF to represent relationships between sections of
another file

  file func_ab.c
  +-------------------------------------------------+
01|/* function A */                                 |
02|int func_a(int i, int j, int k)                  |
03|{                                                |
04|  int p = abs(i + j);                            |
05|  int q = j + 2 * k;                             |
06|  while (q > p)                                  |
07|  {                                              |
08|     p *= 2;                                     |
09|     q -= p;                                     |
10|  }                                              |
11|  return q;                                      |
12|}                                                |
13|                                                 |
14|int func_b(int i, int j, int k)                  |
15|{                                                |
16|  int a_ijk = func_a(i, j, k);                   |
17|  int a_jik = func_a(j, i, k);                   |
18|  int a_kji = func_a(k, j, i);                   |
19|  return (a_ijk > a_jik) ? a_kji*a_kji : a_kji;  |
20|}                                                |
21|                                                 |
  +-------------------------------------------------+

  image_func_ab.jpg
  +---------------------------
  |
  |
  |


  file rel.cf
  +--------------------
  | (instance integer_variable var_1)
  | (instance integer_variable var_2)
  | (instance integer_variable var_3)
  | (var_1 location func_ab.c:line16+6:line16+11)
  | (var_1 name "a_ijk")
  | (var_1 comment "this is the first variable")
  | (shape1 -- rectangular area of image_func_ab.jpg)
  | (relationship <unstated relation> shape1 var_1)
  |


----------------------------

Send & receive messages
Document locations where messages are sent and received,
or data is stored and retrieved.  This is implemented
as a relation with links to send locations and links
to retrieve locations.  Need not even be on the same
device.


-------------------------------------------------------------------------------
Markup Languages / Annotation Systems

http://techcrunch.com/2007/04/10/5-ways-to-mark-up-the-web/
protonotes.com
https://www.diigo.com/
http://en.wikipedia.org/wiki/ShiftSpace
http://en.wikipedia.org/wiki/Reframe_It
http://en.wikipedia.org/wiki/Web_annotation
http://en.wikipedia.org/wiki/Text_annotation
https://gitorious.org/stet
https://lite.co-ment.com/
http://en.wikipedia.org/wiki/Org-mode
http://orgmode.org/
https://github.com/reddit/
https://github.com/reddit/reddit/blob/master/r2/r2/models/link.py